Sunday, 31 October 2021

Can people tell the difference between bottled water and tap water?

One of the funniest papers I have ever read is Can People Distinguish Pâté from Dog Food? (ungated earlier version here). The answer was no, people can't tell the difference. And it turns out that isn't the only example where people can't tell the difference. For example, in blind taste tests most people can't tell the difference between Coke and Pepsi (see here for more on that).

On a related note, this 2018 article (ungated here) by Kevin Capehart (California State University) and Elena Berg (American University of Paris), published in the Journal of Wine Economics (yes, there is such a journal), presents the results of a blind taste test of water. Capehart and Berg ran three experiments with 188 research participants (who were undergraduates at the American University of Paris). First, they gave their participants a brief training in water tasting based on a video of a water sommelier. And yes, there is such an occupation - see here.

Anyway, after their training, the research participants completed the three experiments. In the first experiment, they were given four sets of three glasses of water. Each set contained two glasses of the same bottled water, and one that was different. The research participants' task was to identify the glass that was different. How did they do? Capehart and Berg note that:

Although our subjects were better than chance at identifying the singletons, the singleton was correctly identified less than half of the time for each triangle test, including the last test with the biggest [total dissolved solids] difference... Moreover, our subjects correctly identified the singleton in less than half of the triangle tests—about 1.8 out of the four—on average... Those results suggest our subjects were only slightly better than chance at distinguishing bottled waters.
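To put 'slightly better than chance' in context: with three glasses and one singleton, pure guessing succeeds one time in three, so a guesser would average about 1.33 correct out of four tests, against the observed 1.8. A quick sketch of the arithmetic (mine, not the authors'):

```python
from math import comb

# Under pure guessing, each triangle test is a 1-in-3 chance of
# picking the odd glass out
p_chance = 1 / 3
n_tests = 4

# Expected number correct out of four tests for a pure guesser
expected_chance = n_tests * p_chance  # about 1.33, against the observed 1.8

def prob_correct(k, n=n_tests, p=p_chance):
    """Binomial probability that a pure guesser gets exactly k of n right."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

print(f"chance expectation: {expected_chance:.2f} correct of {n_tests}")
print(f"P(guesser gets 2+ of 4 right) = {sum(prob_correct(k) for k in range(2, 5)):.3f}")
```

Even a pure guesser gets two or more right about 41% of the time, which is why an average of 1.8 correct is only weak evidence of being able to tell the waters apart.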

Ok, so far so not good. In the second experiment, the research participants were given five glasses of water - the same four bottled water types from the first experiment, plus tap water. They were asked to rate each water on a 14-point scale, and to rank them from best to worst. What happened? Capehart and Berg note that:

The Fiji brand of bottled water was given the most first-place rankings with a quarter of participants ranking it as their most preferred. Tap water was given the most last-place rankings with 29% ranking it as their least preferred. However, a quarter of participants also said that Fiji was their least preferred, and almost 20% said that tap water was their most preferred. Thus, there is no clear consensus about which waters are preferable to others.

And, in case you thought that the price of bottled water is a signal of quality, if you exclude tap water (which has a price of nearly zero):

...there is no correlation or perhaps even a negative correlation between the price of a bottled water and its rating.

Finally, in the third experiment, participants were given the same five water samples (randomised in order of course), and five descriptions. Four of the descriptions were taken from the water menu at Ray’s and Stark Bar in Los Angeles, and the fifth was simply "tap water". Their task was to match the descriptions with the samples. How did they do? Capehart and Berg note that:

For each water, the participants were not significantly better than chance at matching the water to its description, except for Acqua Panna, but even in that case, less than 30% of participants were able to correctly match it to its description.

The fact that only 24% of participants correctly identified the tap water also means that 76% mistook a bottled water for tap water. Speyside Glenlivet, Hildon, Acqua Panna, and Fiji were mistaken for tap 28%, 27%, 24%, and 18% of the time, respectively.

What do we learn from this? People can't really tell bottled water and tap water apart. If someone buys expensive bottled water, they are engaging in some conspicuous consumption rather than genuinely purchasing a higher quality product. As Capehart and Berg conclude:

Just as there is more to a wine than the look, smell, or taste of what is inside its bottle, there must be more to bottled waters than what is inside, especially since there are no visual differences among still waters, no odor differences, and subtle or non-existent taste differences. Consumers’ willingness to pay for an expensive bottled water must be rooted in other aspects besides the taste of the water inside it.

Thursday, 28 October 2021

Gender differences in answering in the PISA mathematics test

I've previously written a couple of posts on gender differences in multiple choice answering (see here and here). The key point of the research I highlighted there is that male students perform relatively better in multiple choice questions than female students. However, a good answer as to why male students perform better still eludes us. Here's what I said in my 2019 post:

Maybe it is that female students don't respond well to high pressure or competitive situations (and multiple choice questions are reasonably high pressure). Women are more risk averse than men, so maybe it is related to that? Female students might be more likely to skip questions in order to avoid the risk of losing marks... Also, men are more overconfident than women, so maybe it is related to that? Again, male students might be less likely to skip questions, because they are less likely to be unsure they have the right answer.

So, with that question still open, I was interested to read this job market paper by Silvia Griselda (Bocconi University) on the gender gap in mathematics. Griselda used the PISA tests from 2012 and 2015, which included around 500,000 students in total, from over 60 countries. Interestingly, I hadn't realised that the PISA test formats varied, with different students facing different numbers of multiple choice, closed response questions (which require a very simple, short answer, e.g. a single number), and open response questions (which require a more detailed answer, e.g. an explanation of how the answer was derived). So, Griselda compares the performance on the test between male and female students, based on the proportion of mathematics questions that were multiple choice. She first notes that:

Both in the PISA 2012 and 2015 tests, boys perform better than girls in all formats of mathematics questions. Yet, the gender difference in performance is significantly bigger in multiple-choice questions.

Then in her main analysis, Griselda finds that:

...an increase in the proportion of multiple-choice questions by 10 percentage points differentially reduces girls' scores by 0.031 standard deviations compared to boys in 2012, and by 0.021 in 2015. The effect of multiple-choice questions on the gender gap in performance is not small... This effect is comparable to a decrease in teacher quality of one-quarter of a standard deviation... or an increase in a class size of one student...
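For a sense of magnitude (my extrapolation, assuming purely for illustration that the effect scales linearly in the multiple-choice share):

```python
# Back-of-envelope scaling of Griselda's estimates, assuming (purely
# for illustration) that the effect is linear in the multiple-choice
# share of the exam
effect_per_10pp = {"2012": 0.031, "2015": 0.021}  # SD widening per 10 pp

for year, effect in effect_per_10pp.items():
    # An all-multiple-choice exam versus a no-multiple-choice exam is
    # a 100 percentage point difference, i.e. ten 10 pp increments
    full_shift = 10 * effect
    print(f"{year}: all-MC vs no-MC exam widens the gender gap by ~{full_shift:.2f} SD")
```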

There also appear to be spillovers, whereby female students also perform worse on both closed response and open response questions when more of the mathematics questions are multiple choice. Griselda then further exploits the data from the 2015 PISA, which was computerised and recorded the time that each student spent on each question. She categorises students as 'inattentive' if they spent too little time on three or more questions (defined as spending less than 10 percent of the average time taken on that question in the student's country), and/or if they skipped three or more questions (despite having five or more minutes left when they finished the test). She finds that:
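For concreteness, the inattentiveness rule, as I read it, can be sketched as follows (the function and thresholds are my rendering of the description, not the paper's actual code):

```python
# A sketch of the 'inattentive' flag as described above. Argument
# names and structure are my own reading of the rule.

def is_inattentive(times, country_means, n_skipped, minutes_left,
                   rushed_cutoff=3, skip_cutoff=3):
    """Flag a student as inattentive if they rushed (spent under 10% of
    the country-average time) on 3+ questions, or skipped 3+ questions
    despite finishing with 5+ minutes to spare."""
    rushed = sum(1 for t, mean in zip(times, country_means)
                 if t < 0.10 * mean)
    skipped_with_time = n_skipped >= skip_cutoff and minutes_left >= 5
    return rushed >= rushed_cutoff or skipped_with_time

# A student who rushed three of five questions is flagged
print(is_inattentive(times=[5, 4, 6, 80, 90],
                     country_means=[60, 60, 70, 75, 85],
                     n_skipped=0, minutes_left=0))  # True
```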

...boys are significantly more likely than girls to be identified as inattentive students (the proportion of inattentive boys is 9.38%, while the proportion of inattentive girls is 8.51%, and the difference is statistically different from zero)...

...when the proportion of multiple-choice question features [sic] in the exam increases, girls become differentially more disengaged than boys... This means that a 10 percentage point increase in the proportion of multiple-choice questions can reverse the gender gap in student engagement level.

So, when there are more multiple choice questions, female students are more disengaged (inattentive) and that explains their worse performance in multiple choice. Griselda takes things further though, showing that:

The proportion of multiple-choice questions received has a negative and significant effect on the gender difference in performance among students with a low level of confidence and self-efficacy. On the contrary, there is not a significantly different effect of the proportion of multiple-choice questions receive among high confident [sic] and high self-efficacy students...

The proportion of multiple-choice questions has a negative and significant effect on female performance only among students whose mother is not employed in STEM-related occupations. The marginal effect of the proportion of multiple-choice is not statistically significant among students whose mothers work in STEM-related occupations.

That would seem to support the idea of stereotype threat, as noted in my 2019 post. However, there is one problem that I see with this paper, and that is that the results only seem to hold for mathematics, and not for science (or for reading). If multiple choice is a problem for female students more than male students, then that difference should be apparent for all domains, and yet Griselda finds:

The proportion of multiple-choice questions in reading does not affect males and females performance, while the proportion of science multiple-choice question has an unclear effect on students' performance in science.

I would have been interested to see the same sort of analysis, based on mothers in STEM-related occupations, for the science results (a comparable analysis for confidence wouldn't be possible, as confidence was only asked about in relation to mathematics in 2015).

So, these results again demonstrate that male students perform better than female students in multiple choice questions. They provide further suggestive evidence that this might relate to confidence and stereotype threat. However, given that they appear to hold for mathematics and not for science, they are not conclusive to me. We need more research like this, especially trying to identify the mechanisms that are driving the headline differences. Without understanding the mechanisms, we won't be able to adequately address the problem.

[HT: This article by Griselda in The Conversation, back in May]


Wednesday, 27 October 2021

The pandemic may have revealed all we need to know about online learning

Regular readers of this blog will know that I am highly sceptical of online learning, blended learning, flipped classrooms, and the like. That's come from a nuanced understanding of the research literature, and especially from a concern about heterogeneity. Students respond differently to learning in the online environment, and in ways that I believe are unhelpful. Students who are motivated and engaged and/or have a high level of 'self-regulation' perform at least as well in online learning as they do in a traditional face-to-face setting, and sometimes perform better. Students who lack motivation, are disengaged, and/or have a low level of self-regulation flounder in online learning, and perform much worse.

The problem with much of the research literature, though, is a lack of randomisation. Even when a particular study employs randomisation, it occurs at the level of the student, not at the level of the section or course. That is, particular lecturers opt in to being part of the study (often, they are the researchers who are undertaking the study), so even when students are randomised, instructors are not.

An alternative to a pure, randomised experiment is a natural experiment - where some unexpected change in a real-world setting provides a way of comparing those in online and traditional face-to-face learning. That's where the pandemic comes in. Before lockdowns and stay-at-home orders prevented in-person teaching, some students were already studying online. Other students were studying in person, but were forced into online learning. Comparing the two groups can give us some idea of the effect of online learning on student performance, and a number of studies are starting to appear that do just that. I'm going to focus this post on four such studies.

The first study is this NBER working paper (ungated) by Duha Altindag (Auburn University), Elif Filiz (University of Southern Mississippi), and Erdal Tekin (American University). Altindag was one of the co-authors on the article I discussed on Sunday. Their data come from a "medium-sized, public R1 university" (probably Auburn University), and include a sample of over 18,000 students and over 1,000 instructors. They essentially compare student performance in classes in Spring and Fall 2019 with the same students' performance in classes in Spring 2020, when pandemic restrictions shut the campus down partway through the semester, forcing all in-person teaching online. Importantly:

This shift occurred after the midterm grades were assigned. Therefore, students obtained a set of midterm grades with F2F [face-to-face] instruction and another set (Final Grades) after the switch to online instruction.

Altindag et al. find that, once they account for heterogeneity across instructors:

...students in F2F instruction are 2.4 percentage points (69% of the mean of the online classes) less likely to withdraw from a course than those in online instruction in Fall 2019... Moreover, students in F2F courses are 4.1 percentage points (4 percent) more likely to receive a passing grade, i.e., A, B, C, or D, than their counterparts in online courses.

However, importantly, Altindag et al. go on to look at heterogeneous effects for different student groups, and find that:

Strikingly, for honor students, there seems to be no difference between online and F2F instruction... Students in the Honors program perform equally well regardless of whether the course is offered online or in person... When we turn to students in regular courses, however, the results are very different and resembles the earlier pattern that we discussed in the previous results...

So, the negative impacts of online learning were concentrated among non-honours students, as I suggested at the start of this post. Better students are not advantaged by online learning in an absolute sense, but they are advantaged relatively because the less-able students do much worse in an online setting. Also interestingly, in this study there were no statistically significant differences in the impact of online learning by gender or race. However, they also show some suggestive evidence that having access to better broadband internet reduces the negative impact of online learning (which should not be surprising), but doesn't eliminate it.

Altindag et al. also show that the negative impact of online learning was concentrated in courses where instructors were more vigilant about academic integrity and cheating, which suggests that we should be cautious about taking for granted that grades in an online setting are always a good measure of student learning. 

The second study is this working paper by Kelli Bird, Benjamin Castleman, and Gabrielle Lohner (all University of Virginia). They used data from over 295,000 students enrolled in the Virginia Community College System over the five Spring terms from 2016 to 2020 (with the last one being affected by the pandemic). As this is a community college sample, it is older than the sample in the first study, more likely to be working and studying part-time, and has lower high school education performance. However, the results are eerily similar:

The move from in-person to virtual instruction resulted in a 6.7 percentage point decrease in course completion. This translates to a 8.5 percent decrease when compared to the pre-COVID course completion rate for in-person students of 79.4 percent. This decrease in course completion was due to a relative increase in both course withdrawal (5.2 pp) and course failure (1.4 pp). We find very similar point estimates when we estimate models separately for instructors teaching both modalities versus only one modality, suggesting that faculty experience teaching a given course online does not mitigate the negative effects of students abruptly switching to online instruction. The negative impacts are largest for students with lower GPAs or no prior credit accumulation.

Notice that, not only are the effects negative, they are more negative for students with lower GPAs. Again, Bird et al. note that:

One caveat is that VCCS implemented an emergency grading policy during Spring 2020 designed to minimize the negative impact of COVID on student grades; instructors may have been more lenient with their grading. As such, we view these estimates as a lower-bound of the negative impact of the shift to virtual instruction.

The third study is this IZA Discussion Paper by Michael Kofoed (United States Military Academy) and co-authors. The setting for this study is again different, being based on students from the US Military Academy at West Point. This provides some advantages though. As Kofoed et al. explain:

Generally, West Point students have little control over their daily academic schedules. This policy did not change during the COVID-19 pandemic. We received permission to use this already existing random assignment to assign students to either an in-person or online class section. In addition, to allow for in-person instruction, each instructor agreed to teach half of their four section teaching load... online and half in-person.

This provides a 'cleaner' experiment for the effect of online learning, because students were randomised to either online or in-person instruction, and almost all instructors taught in both formats, which allows Kofoed et al. to avoid any problems of instructors self-selecting into one mode or the other. However, their sample is more limited in size, at the 551 students enrolled in introductory microeconomics. Based on this sample, they find that:

...online instruction reduced a students final grade by 0.236 standard deviations or around 1.650 percentage points (out of 100). This result corresponds to about one half of a +/- grade. Next to control for differences in instructor talent, attentiveness, or experience, we add instructor fixed effects to our model. This addition reduces the estimated treatment effect to -0.220 standard deviations; a slight decrease in magnitude....

Importantly, the results when disaggregated by student ability are similar to the other studies:

...learning gaps are greater for those students whose high school academic preparation was in the bottom quarter of the distribution. Here, we find that being in an online class section reduced their final grades by 0.267 standard deviations, translating to around 1.869 percentage points of the student’s final grade.

Unlike Altindag et al., Kofoed et al. find that online learning is worse for male students, but there are no significant differences by race. Kofoed et al. also ran a post-term survey to investigate the mechanisms underlying their results. The survey showed that:

...students felt less connected to their instructors and peers and claimed that their instructors cared less about them.

This highlights the importance of social connections within the learning context, regardless of whether learning is online or in-person. Online, those opportunities can easily be lost (which relates back to this post from earlier this month), and it appears that not only does online education reduce the value of the broader education experience, it may reduce the quality of the learning as well.

Kofoed et al. were clearly very concerned about their results, as:

From an ethical perspective, we should note that while it is Academy-wide policy to randomly assign students to classes, we did adjust the final grade of students in online sections according to our findings and prioritized lower [College Entrance Examination Rank] score students for in-person classes during Spring Semester 2021.

Finally, the fourth study is this recent article by Erik Merkus and Felix Schafmeister (both Stockholm School of Economics), published in the journal Economics Letters (open access). The setting for this study is again different, being students enrolled in an international trade course at a Swedish university. The focus is also different - it compares in-person and online tutorials. That is, rather than the entire class being online, each student experienced some of the tutorials online and others in person, over the course of the semester. As Merkus and Schafmeister explain:

...due to capacity constraints of available lecture rooms, in any given week only two thirds of students were allowed to attend in person, while the remaining third was assigned to follow online. To ensure fair treatment, students could attend the in-class sessions on a rolling basis, with each student attending some tutorials in person and others online. The allocation was done on a first-name basis to limit self-selection of students into online or in-person teaching in specific weeks.

They then link student performance for the 258 students in their sample in the final examination questions with whether the student was assigned to an in-person tutorial for that particular week (they don't compare whether students actually attended or not - this is an 'intent-to-treat' analysis). Unlike the other three studies, Merkus and Schafmeister find that:

...having the tutorial online is associated with a reduction in test scores of around 4% of a standard deviation, but this effect does not reach statistical significance.

That may suggest that it is not all bad news for online learning, but notice that they compare online and in-person tutorials only, while the rest of the course was conducted online. There is no comparison group of students who studied the entire course in person. These results are also difficult to reconcile with Kofoed et al.: tutorials should be the most socially interactive component of classroom learning, so if students feel the loss of the social element keenly (per Kofoed et al.), then why would the effect of online tutorials be negligible (per Merkus and Schafmeister)? The setting clearly matters, and perhaps that is enough to explain these differences. However, Merkus and Schafmeister didn't look at heterogeneity by student ability, which is a problem, as I have noted many times before.

Many universities (including my own) are seizing the opportunity presented by the pandemic to push forward plans to move a much greater share of teaching into online settings. I strongly believe that we need to pause and evaluate before we move too far ahead with those plans. To me, the research continues to suggest that, by adopting online learning modes, we create a learning environment that is hostile to disengaged, less-motivated students. You might argue that those are the students we should care least about. However, the real problem is that the online learning environment itself might create or exacerbate feelings of disengagement (as the Kofoed et al. survey results show). If universities really care about the learning outcomes of their students, then they should not yet be going 'all in' on online education.


Sunday, 24 October 2021

COVID-19 risk and compensating differentials in a university setting

A compensating differential is the difference in the wage between a job with desirable non-monetary characteristics and a job with undesirable non-monetary characteristics, holding all other factors (like human capital or skill requirements, experience of the worker, etc.) constant. When a job has attractive non-monetary characteristics (e.g. it is clean, safe, or fun), then more people will be willing to do that job. This leads to a higher supply of labour for that job, which leads to lower equilibrium wages. In contrast, when a job has negative non-monetary characteristics (e.g. it is dirty, dangerous, or boring), then fewer people will be willing to do that job. This leads to a lower supply of labour for that job, which leads to higher equilibrium wages. It is the difference in wages between jobs with attractive non-monetary characteristics and jobs with negative non-monetary characteristics that we refer to as a compensating differential (essentially, workers are being compensated for taking on jobs with negative non-monetary characteristics, through higher wages).
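The logic can be made concrete with a stylised supply-and-demand sketch (the numbers are illustrative only): holding demand fixed, an unpleasant job attracts fewer willing workers at any given wage, shifting labour supply inwards and raising the equilibrium wage.

```python
def equilibrium_wage(a, b, c, d):
    """Equilibrium wage where labour demand L_d = a - b*w equals
    labour supply L_s = c + d*w, i.e. w* = (a - c) / (b + d)."""
    return (a - c) / (b + d)

# Pleasant job: many willing workers at any wage (high supply intercept)
w_pleasant = equilibrium_wage(a=100, b=2, c=40, d=2)

# Dangerous job: identical demand, but fewer willing workers (lower intercept)
w_dangerous = equilibrium_wage(a=100, b=2, c=20, d=2)

# The wage gap between the two jobs is the compensating differential
print(f"pleasant job wage:          {w_pleasant:.1f}")
print(f"dangerous job wage:         {w_dangerous:.1f}")
print(f"compensating differential:  {w_dangerous - w_pleasant:.1f}")
```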

The current pandemic presents a situation where many jobs have suddenly had a new and negative non-monetary characteristic added to them - the risk of becoming infected with the coronavirus. The idea of compensating differentials suggests that workers who suddenly face a job that is riskier than before should receive an increase in wages (and indeed, we have seen that, such as the pay bonus that some supermarket workers have received).

There hasn't been much in the way of systematic research on the compensating differentials arising from the pandemic. No doubt we can expect some in the future. An early example is this new paper by Duha Altindag, Samuel Cole, and Alan Seals (all Auburn University). It turns out that Auburn University didn't strictly follow the CDC requirements for safe social distancing in class, leading to some classes having too many students, and therefore being higher risk. As Altindag et al. note:

Possibly due to the cost concerns, Auburn University did not implement any policy about maintaining six feet of distance between students within the classrooms... Instead, the university set an enrollment limit of half of the normal seating capacity in classrooms, despite the Center for Disease Control (CDC) guidelines and the public health orders of the state... This practice of the university led to about 50% of all face-to-face (F2F) courses in Spring 2021 being delivered in “risky” classrooms, in that the number of enrolled students in classes exceeded their classrooms’ CDC-prescribed safe capacity (the maximum number of students that can be seated in the room while allowing a six-foot distance between all students).

Altindag et al. looked at differences in which staff taught the risky (or 'very risky' - classes where the number of enrolled students was more than double the safe room capacity) classes, and then looked at differences in pay between those teaching risky classes and those teaching less risky classes. For the differences in pay, they are able to adopt an instrumental variables approach, using the presence of fixed furniture in the teaching room as an instrument. As they explain:

Our instrument, Dispersible Class, is an indicator for whether students in a classroom can spread away from each other while attending the lectures. This can only happen in in-person classes that take place in rooms with movable furniture or in online courses in which students have already spread away from one another.

I worry a little about the sensitivity of the results to the inclusion of fully online classes. By construction, the instrument (Dispersible Class) always takes a value of one for online classes, and the online classes are by definition non-risky, so the variation they are picking up is entirely driven by the riskiness of the face-to-face classes. That is what you want in this analysis, but why include the online classes in the analysis at all since they aren't contributing any variation?
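For concreteness, the risk categories described in the paper reduce to a simple classification (the function and argument names here are my own):

```python
# A sketch of the risk classification as described above: a class is
# 'risky' if enrolment exceeds the CDC-prescribed safe capacity, and
# 'very risky' if enrolment is more than double that capacity

def classify_class(enrolled, safe_capacity, online=False):
    """Label a course section by COVID risk. 'Safe capacity' is the
    maximum seating allowing six feet between all students."""
    if online or enrolled <= safe_capacity:
        return "safe"
    if enrolled > 2 * safe_capacity:
        return "very risky"
    return "risky"

print(classify_class(enrolled=45, safe_capacity=30))               # risky
print(classify_class(enrolled=70, safe_capacity=30))               # very risky
print(classify_class(enrolled=70, safe_capacity=30, online=True))  # safe
```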

Anyway, nit-picking aside, when Altindag et al. look at who teaches the risky classes, they find that:

...GTAs [Graduate Teaching Assistants] and adjunct instructors, who are ranked low within the University hierarchy, are about eight to ten percentage points more likely to teach a risky class compared to the tenured faculty (full and associate professors) and administrators (such as the department chairs, deans, and others) who teach courses in the same department...

...female instructors are more likely to be teaching risky classes. Additionally... younger faculty face higher risk in their classrooms.

The results are similar for 'very risky' classes. Young faculty and low-ranked faculty (and, possibly, female faculty) have less bargaining power with departmental chairs, so are more likely to acquiesce to a request to teach particular classes, and that is what Altindag et al. find. Those academics have consequently taken on more COVID-19 risk. But, are they compensated for this risk? In their instrumental variables analysis, Altindag et al. find that:

...instructors who teach at least one risky class earn 22.5 percent more than their counterparts who deliver only safe course sections... Relative to the average monthly wage of an instructor in our sample, this effect corresponds to approximately $2,100. In a four-month semester, this impact corresponds to $8,400.

Again, the results are similar for 'very risky' classes. So, even though junior faculty and female faculty take on more risky classes, they are compensated for that additional risk. Note that the estimates of the compensating differential control for the instructor's demographic characteristics, academic level, and experience at Auburn. Altindag et al. find that the compensating differential is roughly the same at all academic levels.
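As a back-of-envelope check of the quoted magnitudes (using the rounded figures from the quote, so the paper's exact numbers will differ slightly):

```python
# A 22.5% premium said to equal roughly $2,100 per month implies a
# mean monthly instructor wage of about $9,300; over a four-month
# semester the premium accumulates to $8,400
premium_rate = 0.225
premium_per_month = 2_100

implied_mean_wage = premium_per_month / premium_rate
semester_premium = 4 * premium_per_month

print(f"implied mean monthly wage: ${implied_mean_wage:,.0f}")
print(f"semester premium:          ${semester_premium:,}")
```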

One other criticism is that perhaps the types of classes that junior faculty typically teach happen to be those that are riskier. Altindag et al. address this by running their analysis using data on classes from the previous year, when COVID-19 was not a thing. They find no statistically significant differences in who teaches 'hypothetically-riskier' classes, and no statistically significant wage premium for those teaching 'hypothetically-riskier' classes. That provides some confidence that the effects they pick up in their main analysis relate to risk in pandemic times.

This paper raises an interesting question. Faculty are compensated for coronavirus risk, through higher wages. However, it isn't only faculty who face higher risk. Students attending those classes are at higher risk as well. Is there a compensating differential for students, and if so, how would we measure it? That is a question for future research.

[HT: Marginal Revolution]


Saturday, 23 October 2021

The drinking age, prohibition, and alcohol-related harm in India

India is an interesting research setting for investigating the effects of policies, because states can have very different policies in place. Consider alcohol: according to Wikipedia, alcohol is banned in the states of Bihar, Gujarat, Mizoram, and Nagaland, as well as most of the union territory of Lakshadweep. In states where alcohol is legal, the minimum legal drinking age (MLDA) varies from 18 years to 25 years. And the laws change relatively frequently - most recently, Mizoram banned alcohol in 2019.

Indian states provide a lot of variation to use for testing the effects of alcohol regulation. And that is what this 2019 article by Dara Lee Luca (Mathematica Policy Research), Emily Owens (University of California, Irvine), and Gunjan Sharma (Sacred Heart University), published in the IZA Journal of Development and Migration (open access), takes advantage of. They first collated exhaustive data on alcohol regulation changes at the state level, focusing on prohibition and changes in the MLDA. They note that:

Between 1980 and 2008, the time frame for our analysis, the MLDA ranged from 18 to 25 years across the country, and some states had blanket prohibition policies. In addition, we identified six states that changed their MLDA at least once; Bihar increased its MLDA from 18 to 21 in 1985, and Tamil Nadu repealed prohibition and enacted an 18-year-old MLDA in 1990, then subsequently increased it to 21 in 2005. Andhra Pradesh and Haryana both enacted prohibitionary policies in 1995 (the MLDA in Andhra Pradesh had been 21, and 25 in Haryana) only to later repeal them in 1998 and 1999.

In all, Luca et al. have data on law changes in 18 states over the period from 1980 to 2009, and for 19 states in a more limited number of years. They then look at a number of different outcome variables, drawn from the 1998-1999 and 2005-2006 waves of the National Family Health Survey, as well as crimes and mortality data. They first show that men who are legally allowed to drink are more likely to report drinking, and that the relationship is statistically significant:

Given that the mean of alcohol consumption for men in the data is approximately 24%, this 5 percentage point change in likelihood of drinking is substantial, representing a 22% increase in the likelihood of drinking.

So, alcohol regulation does affect drinking behaviour (which seems obvious, but is much less obvious for a developing country like India than it would be for most developed countries). Having established that alcohol consumption is related to regulation, Luca et al. then go on to find that:

...husbands who are legally allowed to drink are both substantially more likely to consume alcohol and commit domestic violence against their partners...

...policies restricting alcohol access may have a secondary social benefit of reducing some forms of violence against women, including molestation, sexual harassment, and cruelty by husband and relatives. At the same time, changes in the MLDA do not appear to be associated with reductions in criminal behavior more broadly. We find suggestive evidence that stricter regulation is associated with lower fatalities rates from motor vehicle accidents and alcohol consumption, but also deaths due to consuming spurious liquor (alcohol that is produced illicitly).

In other words, there is evidence that stricter alcohol regulations are associated with lower levels of alcohol-related harm, particularly domestic violence and violence against women. Now, these results aren't causal, although they are consistent with a causal story. Interestingly, Luca et al. choose not to use instrumental variables analysis (which could provide causal evidence), because the regulations proved to be only weak instruments (and they were also worried about violations of the exclusion restriction, because changes in alcohol regulation might have direct impacts on criminal behaviour). Luca et al. still assert that their results 'suggest a causal channel', and to the extent that we accept that, it highlights the importance of alcohol regulation in minimising alcohol-related harm in a developing country context.
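For readers unfamiliar with the weak instruments problem, here is a quick illustration (with entirely made-up data - this is not Luca et al.'s dataset or specification). With a single instrument, the first-stage F-statistic is just the squared t-statistic on the instrument in a regression of the endogenous variable on that instrument:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000

# Synthetic data: a regulation dummy (the would-be instrument) that only
# weakly shifts whether an individual drinks (the endogenous regressor)
z = rng.integers(0, 2, n).astype(float)                          # instrument
drinks = ((0.05 * z + rng.normal(0, 1, n)) > 0.8).astype(float)  # endogenous

# First-stage OLS: regress the endogenous regressor on the instrument
X = np.column_stack([np.ones(n), z])
beta, *_ = np.linalg.lstsq(X, drinks, rcond=None)
resid = drinks - X @ beta
sigma2 = resid @ resid / (n - 2)
se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
t_z = beta[1] / se

# With a single instrument, the first-stage F-statistic is just t^2;
# a common rule of thumb flags F below about 10 as a weak instrument
print(f"First-stage F = {t_z ** 2:.1f}")
```

When F is that low, the second-stage IV estimates are badly biased and have unreliable standard errors, which is presumably why Luca et al. stuck with their reduced-form results.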

Friday, 22 October 2021

The beauty premium at the intersection of race and gender

I've written a lot about the beauty premium in labour markets (see the links at the end of this post), including most recently earlier this week. However, most studies that I am aware of look at the beauty premium for a single ethnic group, or even a single gender, and don't consider that the premium might differ systematically between ethnicity-gender groups. So, I was interested to read this recent article by Ellis Monk (Harvard University), Michael Esposito, and Hedwig Lee (both Washington University in St. Louis), published in the American Journal of Sociology (ungated version here). Their premise is simple (emphasis is theirs):

Given the racialization and gendering of perceived beauty, we should expect such interactions. In short, while Black men may face double jeopardy on the labor market (race and beauty)... Black women may face triple jeopardy (race, gender, and beauty).

Monk et al. use data from the first four waves of the National Longitudinal Study of Adolescent to Adult Health (Add Health), for 6090 White, Black or Hispanic working people who appeared in all four waves of the survey. Like other studies, they measure beauty based on the ratings given by the interviewers, and then they derive an overall beauty score for each research participant. You might be concerned that the race/gender of the interviewer matters. However, Monk et al. note that:

...the vast majority of interviewers (with measured demographic characteristics) are White (just under 70%), and female. Furthermore, the sample of interviewers was highly educated, with 20% having a postgraduate degree, 28% having a college degree, and 31% having some college experience. Again, this interviewer pool represents actors that respondents may typically encounter as gatekeepers in the labor market. This is helpful for the purposes of our study...

They find few differences in attractiveness ratings given by different race/gender interviewers, although:

Black female interviewers appeared to give slightly lower ratings overall than White women (except for when evaluating Hispanic respondents)... Black male interviewers tended to give lower scores to male respondents regardless of their race/ethnicity.

Monk et al. don't feel a need to condition their results on the race/gender of the interviewers, but since they are evaluating beauty based on four different interviewers' ratings, it probably isn't a big deal.

Anyway, onto the results, which can be neatly summarised by their Figure 2:

Looking at the figure, they find that there is a beauty premium for all six race/gender groups, but the beauty premium differs among those groups. In particular, the beauty premium is largest for Black men and women. However, expressing it that way doesn't quite capture what is going on. It isn't so much a larger positive beauty premium, as a larger penalty for unattractiveness. Notice that the disparity in income between Blacks and Whites is much smaller for the most attractive people than for the least attractive people. In fact, the incomes of the most attractive Black women are higher on average than the incomes of the most attractive White women (controlling for age, education, marital status, and other characteristics). The differences are quite substantial. Monk et al. note that (emphasis is theirs):

White males with very low levels of perceived attractiveness are estimated to earn 88 cents to every dollar likely to be paid to White males who are perceived to possess very high levels of attractiveness. This is similar in magnitude to the canonical Black-White race gap, wherein using the same set of controls we find that a Black person earns 87 cents to every dollar a white person makes.

Sizable income disparities are observed among subjects judged to be least and most physically attractive in each other subpopulation analyzed as well. The ratio of predicted earnings of individuals at the 5th percentile of perceived attractiveness compared to individuals at the 95th percentile of perceived attractiveness is 0.83 among White females; 0.78 among Hispanic males; and 0.80 among Hispanic females. Again, note that returns to attractiveness are most pronounced among Black respondents: Black females at the 5th percentile of attractiveness ratings are estimated to earn 63 cents to every dollar of Black females at the 95th percentile of attractiveness. Black males at the 5th percentile of attractiveness are expected to earn 61 cents to every dollar earned by Black males at the 95th percentile of attractiveness.

Clearly, these results are important for the future measurement and understanding of the beauty premium in labour markets. As Monk et al. note, they also contradict Daniel Hamermesh's speculative comment in his book Beauty Pays (which I reviewed here) that:

 ...the effects of beauty within the African-American population might be smaller [than among whites]...

Monk et al. conclude that:

...perceived physical attractiveness is a powerful, yet often sociopolitically neglected and underappreciated dimension of social difference and inequality regardless of race and gender. Further still, its consequences are intersectional...

We should be accounting for that intersectionality in future studies of the beauty premium.

[HT: Marginal Revolution]

Read more:

Monday, 18 October 2021

More evidence on the blond wage premium

Last year, I wrote a post about the wage premium that blondes receive in the labour market. That post was based on this 2012 article by Nicolas Guéguen. It turns out that there is more evidence of the blond wage premium. I recently read this 2010 article by David Johnston (Queensland University of Technology), published in the journal Economics Letters (sorry, I don't see an ungated version online). Whereas Guéguen used data from a field experiment, Johnston relied on panel data from the National Longitudinal Survey of Youth 1979 cohort. Based on a sample of over 20,000 observations, Johnston finds that:

...blonde women earn 7% more than brunette women (the omitted hair colour category).

Interestingly, there is no statistically significant difference in wages between brunettes and women with other hair colours (light brown, black, or red). Also interesting, but not commented on by Johnston, is that green-eyed women earn nearly seven percent more than brown-eyed women (and there is no statistically significant difference in wages between women with blue or hazel eyes, and women with brown eyes). Also interesting is that, when looking at women's spouses:

Spouses of blonde women are estimated to earn around 6% more than the spouses of other women.

That probably reflects assortative matching in the marriage market (more-educated people tend to marry other people with high education). The effect of green eyes is not statistically significant for spousal wages.

Of course, the NLSY data are observational (and self-reported), so any results are correlations rather than causal. Since hair colour is a modifiable characteristic, it could be that women with higher wages are more likely to dye their hair blond (or women with lower wages are more likely to dye their hair darker colours). Or any number of other possibilities. And, Johnston's analysis is limited to "Caucasian women" - this is a point I will return to in my next post on the beauty premium.

As I've noted before (see links below), the beauty premium is a labour market phenomenon for which there is a large (and growing) amount of evidence. Physical features do appear to be correlated with earnings. Blondes do appear to have more fun.

Read more:

Sunday, 17 October 2021

Was Julian Simon just lucky in the Simon-Ehrlich bet?

You may have heard of the famous bet between University of Maryland professor of business administration Julian Simon and Stanford University biology professor (and author of the influential book The Population Bomb) Paul Ehrlich. Ehrlich had argued that resources were becoming scarcer. Simon pointed out that, if resources really were becoming scarcer, then their prices would be increasing. He challenged Ehrlich to choose any raw material, and a date more than a year away, and Simon would bet that the price would decrease over that time rather than increase.

Ehrlich accepted the bet, choosing copper, chromium, nickel, tin, and tungsten as the raw materials on which the bet would be based. The bet was formalised on 29 September 1980, and was evaluated ten years later. The price of all five materials fell between 1980 and 1990, and Simon won the bet.

However, was he just lucky? This 2010 article by Katherine Kiel, Victor Matheson, and Kevin Golembiewski (all College of the Holy Cross), published in the journal Ecological Economics (ungated earlier version here), argues that he was. Kiel et al. look at all of the possible ten-year periods from 1900 to 2008 (there are 99 overlapping decades, from 1900-1910 through 1998-2008), and find that Ehrlich would have won the bet in 61.6 percent of decades. Simon was lucky that the decade the bet was placed in, starting in 1980, was one where resource prices generally declined (and having a large recession just at the time when the bet was being evaluated probably helped).
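Their rolling-window exercise is easy to sketch. The code below uses a synthetic random-walk price series purely for illustration - Kiel et al. used actual commodity price data - but it shows the mechanics: 109 annual prices from 1900 to 2008 yield 99 overlapping ten-year windows, and Simon 'wins' any window in which the end price is below the start price.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative only: a synthetic real-price series for 1900-2008 (a random
# walk standing in for the commodity-price data Kiel et al. actually used)
years = np.arange(1900, 2009)                 # 109 annual observations
prices = 100 * np.exp(np.cumsum(rng.normal(-0.005, 0.08, years.size)))

# Evaluate the bet over every overlapping ten-year window: Simon 'wins'
# a window if the price at the end is below the price at the start
starts = prices[:-10]
ends = prices[10:]
simon_wins = np.mean(ends < starts)
print(f"{starts.size} windows; Simon wins {simon_wins:.1%} of them")
```

With real commodity price data in place of the random walk, the same calculation reproduces Kiel et al.'s headline result.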

Kiel et al. evaluated the bet using annual data. It would be interesting to see how the bet would fare using higher-frequency data. How sensitive was the outcome of the bet to the recessionary state of the economy at the end of the bet period, for example? Also, by taking such a long time period into account, Kiel et al. ignore the endogeneity of the choice of materials - if the bet had been placed in 1900, Ehrlich may well have chosen five completely different raw materials.

The Simon-Ehrlich bet was important in demonstrating that the neo-Malthusian view of the world is overly pessimistic. However, we need to be careful not to overstate its significance. Julian Simon got a little bit lucky.

Saturday, 16 October 2021

Book review: People, Power, and Profits

Last month, I reviewed Joseph Stiglitz's book Globalization and Its Discontents. This month, I fast-forwarded to reading his most recent book People, Power, and Profits. This book builds on Stiglitz's previous books on globalisation, inequality, economic policy and finance, and proposes an agenda to reclaim capitalism and rebuild the middle class in America. And this really is a book about America, and the failings of its democracy, its recent decades of deregulation, and its engagement with globalisation. There are insights that other countries can draw on (I'll come to an example a bit later), but the readers who will gain the most from this book are readers with an understanding of the US experience.

The book was published in 2020, and written while President Trump was still in power, and Stiglitz's anxiety about the potential for a second term Trump government is plain. However, he is clearly looking beyond that, and the book has an undertone that is more hopeful than you might expect. Nevertheless, Stiglitz doesn't pull any punches in considering where things went wrong:

...we got the economics, the politics, and the values wrong.

We got the economics wrong: we thought unfettered markets - including lower taxes and deregulation - were the solution to every economic problem; we thought finance and globalization and advances in technology would, on their own, bring prosperity to all. We thought that markets were, on their own, always competitive - and so we didn't understand the dangers of market power...

We got our politics wrong: too many thought that just having elections was all that democracy was about. We didn't understand the dangers of money in politics, its power...

We got our values wrong. We forgot that the economy is supposed to serve our citizens, not the other way around.

A lot of the book focuses on power (it's in the title, after all): political power, but especially market power. Rent seeking and regulatory capture, concepts that would hopefully be quite familiar to my ECONS102 class, also make a frequent appearance. In particular, the intersection of market power and financial deregulation is implicated as particularly problematic:

The most important failing of the banks, however, was not the multiple ways they cheated and exploited others or the excessive risk taking that brought the global economy to its knees, but their failure to do what they were supposed to do - provide finance, at reasonable terms, for businesses as they sought to make investments that would allow the economy to grow.

His solution for the banks' failings involves re-focusing the sector away from short-term profits, and undermining the role of market power:

By circumscribing the riskier and more abusive ways the financial sector makes profits, we will encourage it to do more of what it should be doing. But that won't be enough. We also need to make the financial sector more competitive.

Stiglitz's solutions to the US problems are far-reaching, including increased investment in education, research and development, reforming social protection and the welfare state, health care, home ownership, and a greater focus on social justice. Not all of the policy prescription is appropriate for other countries, where social security and health care, in particular, already largely follow Stiglitz's prescriptions.

However, one aspect that I think New Zealand could consider is his prescription on home ownership. Stiglitz advocates for government provision of mortgage finance. The government could provide mortgages at lower interest rates than the private sector (because of the government's much lower borrowing cost), and administer repayments through the tax system (lowering transaction costs). Stiglitz also notes that such a system could more easily deal with contingencies (such as unemployment, or reduced income), which would reduce the number of foreclosures. He doesn't go quite this far, but it might be possible to make mortgage payments income-contingent (as the student loan system currently is). This strikes me as something that New Zealand could consider as a complement to social housing, helping more low-income and young people onto the property ladder, and freeing up the inadequate quantity of social housing for those who are truly in need.
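To see how income-contingency might work, here is a minimal sketch (every parameter - the income threshold, repayment rate, and standard payment - is hypothetical, not an actual policy setting): the borrower repays a share of income above a threshold, capped at the standard amortising payment, so repayments fall automatically when income falls.

```python
# All parameters here are hypothetical, not actual policy settings

def annual_repayment(income, threshold=25_000, rate=0.125, standard_payment=30_000):
    """Repay a share of income above a threshold, capped at the standard payment."""
    contingent = max(0.0, income - threshold) * rate
    return min(contingent, standard_payment)

print(annual_repayment(90_000))    # normal year: repays 8125.0
print(annual_repayment(20_000))    # low-income year: repays 0.0, no foreclosure
```

The cap means high earners just pay the ordinary mortgage payment, while the threshold builds in the automatic insurance against unemployment that Stiglitz highlights.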

I quite enjoyed this book. However, I wonder how much of my enjoyment was underpinned by having some understanding of the current policy settings in the US. Readers who do not follow US politics and policy as closely may well find some of the current (and past) politics a little bewildering. This isn't to suggest that those readers won't similarly enjoy the book - they'll probably just encounter a number of 'what the...?' moments. US politics is crazy, but it's important to understand, because, as Stiglitz notes several times in the book:

The difficulty is not the economics, but the politics.


Thursday, 14 October 2021

Surprise! Non-drinkers support more restrictions on alcohol, but minimum unit pricing might be a hard sell

I've been meaning to read the results from the Alcohol Use in New Zealand study for some time, given that they were released near the start of the year. In particular, I was interested in this bit on public attitudes on policy interventions. The report is brief (four pages), and reports on results from a representative survey of over 4500 New Zealand adults undertaken last year (the methodology is described in a separate report).

Anyway, the results are interesting because they ask to what extent people support various interventions drawn from the WHO's SAFER initiative:

  • (S) Strengthen restrictions on alcohol availability;
  • (A) Advance and enforce drink driving countermeasures;
  • (F) Facilitate access to screening, brief interventions and treatment;
  • (E) Enforce bans or comprehensive restrictions on alcohol advertising, sponsorship, and promotion; and
  • (R) Raise prices on alcohol through excise taxes and pricing policies.
In relation to the first intervention, they found that:

Four in five (83%) respondents supported tightening restrictions on drink driving by making the penalties harsher. Women... were more likely... to show support.

Those who didn’t drink in the last week... were also more likely to show support to this policy.

However, there was lower support (48%) for changing the blood alcohol limit when driving to zero. More likely to show support were women...

Non-drinkers... were also more likely to express support.

For the second intervention:

Three-quarters (76%) of respondents supported banning the promotion of alcohol from social media that under 18-year-olds use...

Women... were more likely to express support, along with non-drinkers...

Three in five (62%) respondents supported banning alcohol sponsorship at sporting, community and other events that under 18-year-olds go to...

More likely to show support were women... Non-drinkers... were also more likely to express support.

Are you beginning to see a pattern? For the other three interventions:

Three in five (60%) respondents supported requiring health professionals to regularly ask patients about their drinking.

Women... and non-drinkers were more likely to express support. 

Just over half (54%) of respondents supported having fewer places selling alcohol in the local community.

More likely to express support were women... and non-drinkers.

One in three (33%) respondents supported raising the minimum price of alcohol.

More likely to show support were women...

Non-drinkers... were also more likely to support the policy.

In fact, it appears that for every single policy intervention that people were asked about, non-drinkers were more likely to support those interventions (and so were women). In terms of non-drinkers, it's pretty easy to support a policy when you expect to face none of the costs of the policy once it's implemented. It would have been interesting to see how much support there was among drinkers for these policy interventions, although I expect that you then get exactly the opposite problem (since drinkers face all of the costs of the policy)!

Putting aside the difference in preferences between drinkers and non-drinkers (and men and women) for the moment, the relative ranking of the various interventions is interesting. I don't know whether the differences in ranking are statistically significant, but taking the headline results, the interventions can be ranked in terms of public preferences:

  1. Stricter penalties for drink driving (83%)
  2. Banning the promotion of alcohol from social media (76%)
  3. Banning alcohol sponsorship at events (62%)
  4. Requiring health professionals to ask about drinking (60%)
  5. Having fewer places selling alcohol (54%)
  6. Lowering the blood alcohol limit for driving to zero (48%)
  7. Raising the minimum price of alcohol (33%)
I don't know about the feasibility of doing #2, but the others with more than 50% support are certainly feasible (as are the other two interventions with less than majority support). It would also be interesting to see if the relative ranking was the same for both drinkers and non-drinkers.

The distinct lack of support for raising the minimum price is interesting as well, given that it has been a policy favoured in some other countries, like Scotland. Wales implemented a minimum price last year, and the Republic of Ireland will introduce minimum pricing from 1 January 2022. Economists aren't typically in favour of price controls, but in this case minimum pricing may actually be more efficient. As Paul Calcott (Victoria University of Wellington) showed in this 2019 article published in the Journal of Health Economics (sorry, I don't see an ungated version online), the combination of minimum unit pricing with an excise tax may be optimal:

...when relatively cheap forms of alcohol are undertaxed, and the quality and quantity of alcohol are substitutes.

The numerical example that Calcott provides suggests that heavier drinkers may be undertaxed (while lighter drinkers might be overtaxed), because heavier drinkers pay no more per ounce of alcohol than lighter drinkers do. The article is quite mathematical, so not for the faint of heart. My takeaway from Calcott's article was that there was modest support for minimum unit pricing, since it would affect the heaviest drinkers the most. However, clearly there is little support from the public for this intervention.
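The intuition for combining the two instruments can be sketched with some hypothetical numbers (illustrative only - not actual excise or minimum-price settings): an excise tax raises the price of all alcohol in proportion to its alcohol content, while a minimum unit price puts a floor under the shelf price, so it bites only on the cheapest products - the ones the heaviest drinkers disproportionately buy.

```python
# Hypothetical numbers only - not actual excise or minimum-price settings

def shelf_price(pretax_price, std_drinks, excise_per_drink=0.5, mup_per_drink=1.5):
    """Price after the per-drink excise, floored at the minimum unit price."""
    taxed = pretax_price + excise_per_drink * std_drinks
    floor = mup_per_drink * std_drinks
    return max(taxed, floor)

# Cheap, high-alcohol product: the floor (1.5 x 30 = 45.0) binds
print(shelf_price(pretax_price=8.0, std_drinks=30))    # 45.0
# Premium product: the taxed price (29.0) already exceeds the floor (12.0)
print(shelf_price(pretax_price=25.0, std_drinks=8))    # 29.0
```

The minimum price therefore raises the per-drink cost of exactly the products where the excise tax alone leaves alcohol cheapest, which is the sense in which the two policies together may be optimal.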

The question now is whether the government has any appetite to revise alcohol laws and implement any of the policy interventions that do appear to have majority support.

Wednesday, 13 October 2021

The value of an in-person university education

Last year, I posted about this research on the willingness-to-pay for studying in person. That research was based on a survey of 46 Columbia University public health students. The researchers asked students how much they would be willing to pay in a straightforward way that is open to substantial bias, whereas a better approach would present students with hypothetical scenarios and derive their willingness-to-pay from their choices between scenarios (using either a contingent valuation approach or a discrete choice experiment). I concluded that post with:

Hopefully, someone else is doing research along those lines.

It turns out there was, and the results are reported in this NBER Working Paper by Esteban Aucejo, Jacob French (both Arizona State University), and Basit Zafar (University of Michigan). Specifically, they surveyed over 1500 students at Arizona State University, asking (among other things) how likely they would be to re-enrol in the Fall 2020 semester, under different conditions and at different costs. Importantly, the conditions included: (1) whether the pandemic continued, or was controlled; (2) whether classes would be in person, or remote; and (3) whether campus life and activities were restricted, or could continue as before. Each survey respondent was presented with six scenarios (with combinations of the conditions) at seven different levels of cost. That allowed Aucejo et al. to extract estimates of the willingness-to-pay (WTP) for in-person classes, and for access to the usual campus life and activities. For the sample as a whole, they found that:

...students are willing to pay $1,043 (approximately 8.1% of average annual cost) to have access to [campus social life]... students are willing to pay $547 more per year in order to have in-person classes (relative to remote classes); this represents 4.2% of average annual cost of attending university, and approximately half the WTP for social activities.

So, there is a small, but statistically significant willingness to pay for campus social activities, and for in-person classes. Students are willing to pay more for social activities than for in-person classes. Some might argue that's because Arizona State University has a reputation as a party school, although it is now recognised for its investment in research. More likely, the higher WTP for social activities reflects the availability of substitutes. Online classes are a viable substitute for in-person classes, but online social activities are not much of a substitute for on-campus social activities.
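For readers unfamiliar with how a WTP estimate falls out of a discrete choice model, the basic arithmetic is simple (the coefficients below are invented for illustration, chosen only to land near Aucejo et al.'s headline numbers - they are not the paper's actual estimates): a utility difference is converted into dollars by dividing by the marginal utility of money, which is the negative of the cost coefficient.

```python
# Invented coefficients, chosen only to land near the paper's headline
# numbers - these are not Aucejo et al.'s actual estimates

beta_cost = -0.0004       # utility per dollar of annual cost
beta_inperson = 0.22      # utility of in-person (vs remote) classes
beta_social = 0.42        # utility of unrestricted campus social life

# A utility difference becomes dollars when divided by the marginal
# utility of money (the negative of the cost coefficient)
wtp_inperson = beta_inperson / -beta_cost
wtp_social = beta_social / -beta_cost
print(f"WTP in-person: ${wtp_inperson:,.0f}; WTP social life: ${wtp_social:,.0f}")
```

The scenario design matters because the cost coefficient can only be estimated when respondents face the same options at different prices - which is exactly what the seven cost levels in the survey provide.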

Things get even more interesting when Aucejo et al. look at heterogeneity across the student sample. This is illustrated in Figure 5 from the paper, which plots the cumulative distribution functions for WTP for social activities and in-person classes:

Concentrating on the WTP for in-person classes, about one-third of students are willing to pay a negative amount for in-person classes. Those students prefer to study online. Then there are about 20 percent of students with a WTP of roughly zero, and about half of students have a positive WTP for in-person classes. Those students prefer not to study online. The top half of the distribution (positive WTP) is similar for both in-person classes and social activities. There are far fewer students with negative WTP for social activities than there are for in-person classes.

Aucejo et al. then look at WTP across groups, and find that:

...first-generation students' average WTP for in-person classes is only $204 per year, while second-generation students (that is, students with at least one college-educated parent) have an average WTP of $550... First-generation students also appear less willing to pay for campus social activities (on average, $547 per year versus $1,126 for second-generation)... Similar patterns emerge across a number of socioeconomic divides; for example, nonwhite and non-Honors students appear less willing to pay for in-person instruction and social activities, respectively.

Looking closely at their analysis of WTP for social activities, lower-income and first generation students have lower WTP for social activities, but those effects disappear once Aucejo et al. control for hours worked. Students who work more than 20 hours per week (who also happen to be disproportionately lower-income and first generation students) have lower WTP for social activities. It is likely that reflects a time constraint. In relation to WTP for in-person classes, lower-income students have lower WTP, and that effect is persistent even after controlling for other characteristics (including working). None of the other socio-economic characteristics they test are associated with WTP for in-person classes.

There are a number of things we can take away from this paper, in terms of what it implies about the post-pandemic university teaching environment. First, and most importantly, students are heterogeneous in their preferences (this should not come as a surprise). Some students prefer in-person classes, while other students prefer online classes. A one-size-fits-all approach to university education as we come out of the pandemic is clearly not going to be optimal for all students. However, trying to cater to both groups simultaneously (by providing classes that are at once both in-person and online) is not optimal either. This flexible approach (that many universities are currently adopting) increases lecturer workload. Consequently, it likely reduces the quality of teaching they can offer, both to online and in-person students, compared with teaching in a single mode (either in-person or online). That suggests to me that specialisation is going to out-perform the flexible model. Universities that specialise in in-person teaching will do a better job of it than the flexible university, and will be more attractive to students who prefer in-person learning. Universities that specialise in online teaching will do a better job of it than the flexible university, and will be more attractive to students who prefer online learning. The flexible university is stuck in the middle, not catering adequately to either group of students, despite frantically trying to provide for both.

Second, the value of the social interactions that students have on-campus is large and important. This value is mostly lost in the online model. If universities are committed to a flexible approach (despite the problems I just noted), or are providing an online-only model of education, then finding some way of replicating the social activities for remote students is a must. Universities that can do this well will provide significant value to their students. However, time constraints are binding on social activities for lower-income and first generation students. Finding ways to ensure that these students are able to engage in the important social activities that generate lasting social networks of peers and future colleagues, partners, and collaborators is going to be important, both for in-person and online students.

Overall, I note that more students demonstrated a preference for the in-person teaching model. That fits nicely with my priors. It would be interesting to know whether that result holds across more institutions than just Arizona State University.

[HT: Marginal Revolution, back in March]

Read more:

Tuesday, 12 October 2021

Nobel Prize for David Card, Joshua Angrist, and Guido Imbens

I was excited to hear that the 2021 Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel (aka Nobel Prize in Economics) was announced as being awarded to David Card (University of California, Berkeley), Joshua Angrist (MIT), and Guido Imbens (Stanford University). David Card's half of the award was "for his empirical contributions to labour economics", while Angrist and Imbens shared the other half "for their methodological contributions to the analysis of causal relationships."

This was a well-deserved prize, as all three have made outstanding contributions to empirical microeconomics, and in particular they laid the foundations for what has been termed the 'credibility revolution' in economics. The revolution was the turn towards a greater focus on empirical work that began from the 1990s, and particularly involved the adoption of new econometric methods that now encompass instrumental variables regression, difference-in-differences, and regression discontinuity designs, among others. Regular readers of this blog will certainly have come across those terms before, as they are now the de facto standard for much of the empirical work in applied microeconomics.

The focus on applications is perhaps the most important aspect of their work. Rather than making contributions purely to econometric theory, all three have written key papers that apply those techniques to important real-world policy problems. Alex Tabarrok at Marginal Revolution has written an excellent (albeit brief) summary of their contributions, as has The Economist. You can read the Nobel Committee's summary of their work here.

The work of all three awardees has appeared at various times on this blog, including a reference to David Card's work (with Alan Krueger) on the minimum wage just a few days ago (albeit noting that the literature as a whole probably doesn't support the Card and Krueger results now), Card's work (with co-authors) on the gender gap at top economics journals, reviews of Joshua Angrist's popular econometrics books (with Jorn-Steffen Pischke) Mostly Harmless Econometrics (review here) and Mastering 'Metrics (review here), and Guido Imbens on statistical significance back in August.

One sad aspect of this award is that Alan Krueger, who would surely have shared the award given his early work with both David Card and Joshua Angrist, passed away in 2019. Imbens may well have been the beneficiary, but nevertheless he, along with Card and Angrist, is very deserving of recognition.

Anyway, I'm quite overexcited by this award - I appear to have run out of superlatives to use in this post!

Sunday, 10 October 2021

Sex ratios, bargaining power, and mate preferences

Last year, I wrote a post critiquing some research on 'sexual economics', because they used the supply and demand model as an underlying theory. A better underlying theory is based on a search model (see this post on the economics of sex robots, for example). Usually, we apply the search model to the labour market, but as I note in my ECONS101 class, it can also be used to describe matching in marriage markets, or in the markets for 'shorter-term relationships'.

The simple explanation in the relationship market works like this. Each matching of a couple [*] to each other creates a surplus that is shared between both of them. Because relationship matching creates a surplus, this provides each partner with a small amount of market power (or bargaining power). That is because if one of them rejects the opportunity for the relationship, the other has to start looking for someone else. Each partner is somewhat reluctant to start their search over, so each partner can use that to their advantage. The division of the surplus created by the match will depend on the relative bargaining power of each partner. Whichever partner has more bargaining power will get a better deal.

That brings me to this recent article by Kathryn Walter (University of California, Santa Barbara) and 107 (!) other co-authors (spread across 76 institutions!), published in the journal Proceedings of the Royal Society B (open access). Walter et al. reported on a large cross-country sample of over 14,000 survey respondents from 45 countries, and investigated the relationship between sex ratios (the number of males per 100 females) and mate preferences. Specifically:

Participants completed a 5-item questionnaire on ideal mate preferences for a long-term romantic partner. Participants rated their ideal romantic partner on five traits: kindness, intelligence, health, physical attractiveness and good financial prospects. All items were rated on bipolar adjective scales ranging from 1 (very unintelligent; very unkind; very unhealthy; very physically unattractive; very poor financial prospects) to 7 (very intelligent; very kind; very healthy, very physically attractive; very good financial prospects). Using the same scales as for preferences, participants additionally rated themselves on the same five traits: kindness, intelligence, health, physical attractiveness and good financial prospects.

They used these data to construct two measures of preferences for each of the five items: (1) an absolute mate preference (which is the research participant's preference rated from one to seven); and (2) a relative mate preference (which is the research participant's preference, relative to the average of all people in the sample from their city/country, measured as a z-score). The rationale for the relative measure is to account for the fact that the same absolute preferred trait value may be more or less demanding depending on the availability of that trait in the local population.
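The relative measure is essentially a z-score. As a minimal sketch (my own illustration with hypothetical ratings, not Walter et al.'s actual code), it can be computed like this:

```python
import statistics

def relative_preferences(ratings):
    """Convert absolute 1-7 preference ratings into z-scores
    relative to the local (city/country) sample."""
    mean = statistics.mean(ratings)
    sd = statistics.stdev(ratings)
    return [(r - mean) / sd for r in ratings]

# Hypothetical ratings of 'good financial prospects' from one city's sample
ratings = [4, 5, 6, 5, 7, 3, 5]
z_scores = relative_preferences(ratings)
# Ratings above the local mean get positive z-scores, below get negative
print([round(z, 2) for z in z_scores])
```

The point of the standardisation is that a rating of six means something different in a population where everyone demands a six than in one where the average demand is a four.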

They also used various alternative measures of the sex ratio, based on the inclusion of different age groups. Using a multi-level model and controlling for various socio-economic differences between countries, Walter et al. find that for relative preferences:

The interaction between sex and sex ratio predicted relative preference for good financial prospects and relative preference for physical attractiveness for every measure of sex ratio...

In general, as men became more numerous, men, compared to women, decreased their relative preferences for good financial prospects, whereas women, compared to men, tended to increase their relative preferences for good financial prospects...

In general... as men became more numerous, men decreased their relative preference for physical attractiveness, whereas women tended to increase their relative preference for physical attractiveness...

The interpretation of these results isn't quite correct: men weren't becoming more (or less) numerous, since the data were cross-sectional. Instead, we should read it as saying "Ceteris paribus, in cities or countries where men are more numerous..." However, the takeaway message is clear, and consistent with a search model of the relationship market, where bargaining power matters. In places where the sex ratio is skewed more towards men, men have less bargaining power in the relationship market, and women can afford to be choosier, preferring men with better financial prospects and men who are better-looking. The opposite happens for men's preferences - they must be less choosy and can't afford to limit themselves to women with better financial prospects or better-looking women. In contrast, in places where the sex ratio is skewed more towards women, men have more bargaining power in the relationship market, and men can afford to be choosier, preferring women with better financial prospects and women who are better-looking. The opposite happens for women's preferences - they must be less choosy and can't afford to limit themselves to men with better financial prospects or better-looking men.

It is interesting that bargaining power doesn't appear to affect preferences for intelligence, kindness, or health. Walter et al. suggest that this may be because "they are so highly desired, and therefore more invariant". The mean preferences for those attributes were so high that people are not willing to trade them off at the observed levels of the sex ratio. Perhaps they would if the sex ratio were extremely skewed, but that isn't what Walter et al. observe.

Overall, these results demonstrate that bargaining power matters, not just in the labour market, but also in other situations where matching is a feature.

[HT: The Dangerous Economist]


[*] This same explanation could easily be extended to polyamorous relationships.

Saturday, 9 October 2021

More empirical support for the disemployment effects of the minimum wage

Three new papers I recently read, taking different approaches, have all come out supporting the disemployment effects of the minimum wage. The first is this NBER Working Paper (alternatively, here) by David Neumark (University of California, Irvine) and Peter Shirley (West Virginia Legislature). They first take issue with the way that the body of literature on the minimum wage is summarised in existing research, noting that:

Most academic writing on the U.S. research evidence gives the impression that even where there is some evidence of negative employment effects, we really cannot reach any conclusions or consensus...

The absence of complete agreement across a large set of studies is not surprising. It is also not surprising that advocates on one side or the other emphasize different studies – in particular, the ones more consistent with their policy positions. What is surprising, though, is the absence of agreement on what the research literature says – that is, how economists even summarize what the body of evidence says about the employment effects of minimum wages. Depending on what one reads about how economists summarize the evidence, one might conclude that: (1) it is now well-established that higher minimum wages do not reduce employment, (2) the evidence is very mixed with effects centered on zero with no basis for a strong conclusion one way or the other, or (3) most evidence points to adverse employment effects... 

These disparate conclusions have been a feature of the literature at least since the groundbreaking Card and Krueger study (ungated) in 1994 that suggested the minimum wage had no disemployment effects. Neumark and Shirley then construct a database of all 69 papers presenting estimates of the employment effects of the minimum wage in the U.S., since the New Minimum Wage Research Symposium in 1992. They avoid taking a meta-analysis approach (a point I will come back to), instead simply summarising the point estimates of the 'preferred estimate' from each of the studies in their sample. They argue in favour of this approach because:

...we think that using the entire set of estimates reported is likely to fail to convey the conclusions of the research, most importantly because many papers present estimates that the authors do not view as credible.

In other words, because most studies present many estimates in order to demonstrate their robustness to different assumptions or subsamples, or in order to demonstrate the effects based on assumptions used in other studies, Neumark and Shirley feel justified in selecting a single 'preferred' estimate from each study. Before you get carried away though, they didn't cherry-pick the 'preferred' estimate - they used the estimate that the authors referred to as preferred or highlighted in the paper's conclusions, and for papers where it wasn't obvious, they contacted the authors.

It turns out that most studies find negative employment effects of the minimum wage. Neumark and Shirley draw the following conclusions:

  • There is a clear preponderance of negative estimates in the literature. In our data, 78.9% of the estimated employment elasticities are negative, 53.9% are negative and significant at the 10% level or better, and 46.1% are negative and significant at the 5% level or better.
  • This evidence of negative employment effects is stronger for teens and young adults, and more so for the less-educated.
  • The evidence from studies of directly-affected workers points even more strongly to negative employment effects.
  • The evidence from studies of low-wage industries is less one-sided, with 66.7% of the estimated employment elasticities negative, but only 33.3% negative and significant at the 10% level or better, and the same percent negative and significant at the 5% level or better. We suggest, however, that the evidence from low-wage industries is less informative about the effects of minimum wages on the employment of low-skill, low-wage workers.

From my perspective, the problem with this study is that it isn't a meta-analysis, despite Neumark and Shirley's arguments to the contrary. Neumark and Shirley are known for their studies finding in favour of disemployment effects of the minimum wage, so many readers would not be surprised by their conclusions. What we need is a more objective analysis.

This 2019 meta-analysis by Paul Wolfson (Dartmouth College) and Dale Belman (Michigan State University), published in the journal Labour (ungated version here) avoids the same accusations of bias, because Wolfson and Belman don't have the same history of being on one side of the debate. Their meta-analysis covers 37 studies (with 739 estimates) of the employment effects of the minimum wage in the U.S., published over the period from 2000 to 2015. They find that:

...the range of the employment elasticity has shifted toward zero since Brown et al. (1982), from [-0.3, -0.1] to [-0.13, -0.07], but, in contrast to prior meta-analyses Doucouliagos and Stanley (2009), Belman and Wolfson (2014), our estimates are negative and statistically significant, albeit of small magnitude.

The employment elasticity of the minimum wage is the percentage change in employment that results from a one percent increase in the minimum wage. The range of [-0.13, -0.07] suggests that a one percent increase in the minimum wage would reduce employment by between 0.07 and 0.13 percent. It's not an incredibly large effect, but it is negative and statistically significant, suggesting that across all of the studies in this meta-analysis considered jointly, increases in the minimum wage do reduce employment overall. However, Wolfson and Belman also note that:
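To make that elasticity range concrete, here is a simple back-of-the-envelope calculation (my own illustration with hypothetical numbers, not from the paper):

```python
def employment_change(elasticity, pct_wage_increase, employment):
    """Predicted change in employment from a minimum wage increase,
    given an employment elasticity (percentage change in employment
    per one percent change in the minimum wage)."""
    return employment * elasticity * (pct_wage_increase / 100)

# Hypothetical scenario: 100,000 minimum-wage workers, a 10% increase
for elasticity in (-0.13, -0.07):
    change = employment_change(elasticity, 10, 100_000)
    print(f"elasticity {elasticity}: {change:+,.0f} jobs")
# In this example, a 10% increase implies roughly 700 to 1,300 fewer jobs
```

That is, the effect is small in percentage terms, but it scales with the size of the affected workforce and the size of the minimum wage increase.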

Teenagers, and eating and drinking establishments together account for more than half of the estimates in our sample. Estimating separate models for teens and for eating and drinking places has little effect on our estimated range, moving it from [-0.13, -0.10] to [-0.11, -0.07]... The minimum wage then has negative employment effects, but estimates of them have become smaller and are largely localized to teenagers, who comprise a declining share of the labor force.

Finally, a common critique of the minimum wage literature (from both sides) is that the analyses are conducted in a way that increases the chances of concluding one way or the other. One way to avoid that is to adopt a pre-analysis plan (see this post for more). This is uncommon in research in non-experimental settings, but is the approach adopted in this NBER Working Paper (alternatively, here) by Jeffrey Clemens (University of California, San Diego) and Michael Strain (American Enterprise Institute). After two initial papers they wrote in 2017 and 2018 (using U.S. state-level data from 2011 to 2015), Clemens and Strain pre-committed to a particular analysis to be conducted using data through to 2019. This has been a period of significant change in minimum wages in the U.S., as they note:

After the Great Recession, there was a pause in both state and federal efforts to increase minimum wages. This pause created a baseline (or “pre-period”) for empirical purposes. It was followed by considerable divergence in states’ minimum wage policies. A number of states legislated and began to enact minimum wage changes that varied substantially in their magnitude. From January 2011 to January 2019, for example, Washington, D.C., California, and New York had increased their minimum wages by 61, 50, and 53 percent, respectively. Wage floors rose more moderately in an additional 24 states and were unchanged in the remainder.

Clemens and Strain differentiate between four 'policy groups' of states in their analysis:

The first group consists of states that enacted no minimum wage changes between January 2013 and the later years of our sample. The second group consists of states that enacted minimum wage changes due to prior legislation that calls for indexing the minimum wage for inflation. The third and fourth groups consist of states that have enacted minimum wage changes through relatively recent legislation. We divide the latter set of states into two groups based on the size of their minimum wage changes and based on how early in our sample they passed the underlying legislation.

 Using data from the Current Population Survey and the American Community Survey, they find that:

...over the short and medium run, relatively large increases in minimum wages have reduced employment rates among individuals with low levels of experience and education by just over 2.5 percentage points. Second, our estimates of the effects of relatively small minimum wage increases are variable and centered on zero, as are our estimates of the effects of minimum wage increases linked to inflation-indexing provisions. Finally, our results provide evidence that the medium-run effects of large minimum wage changes are larger and more negative than their short-run effects.

All of that seems consistent with the other two papers - there are disemployment effects of the minimum wage. However, there is a note of caution about the Clemens and Strain results:

The minimum wage increases we analyze are thus much larger than the typical increases analyzed in previous research. Indeed, the average increase in our set of “large” increases is over four times as large, in percent terms, as the average size of the 138 increases analyzed by Cengiz et al. (2019).

Large changes in the minimum wage can be expected to have large disemployment effects. That should be a worry in the U.S. context, where there are policy proposals to substantially raise the minimum wage in some states.

Overall, these three papers paint a picture that, to me, is consistent with much of the highest quality research on the minimum wage in recent years. Raising the minimum wage does reduce employment, and that reduction in employment is concentrated among young and less educated workers. Policy proposals to raise the minimum wage (or, to implement a living wage) need to consider how to best ameliorate these negative effects.

[HT: For the Neumark and Shirley paper, Marginal Revolution]

Read more: