Sex, Drugs and Economics: January 2020

Thursday, 30 January 2020

The beauty and height premiums in the labour market

I've written a few times about the beauty premium in labour markets - more attractive people do better in school and earn more in the workforce (see here and here, as well as here for my review of Daniel Hamermesh's excellent book Beauty Pays). Attractiveness is not the only physical attribute that attracts a premium. Taller people also tend to get a wage premium.

So, I was interested to read this 2008 article by Hung-Lin Tao (Soochow University), published in the journal Kyklos (ungated). Tao used data from 6,452 Taiwanese female college students in 2003, who were surveyed again in 2004-2005 about their earnings (among other things). The sample was limited to female students because compulsory military service in Taiwan meant that most of the men in the sample were not working at the time of the survey. Respondents to the survey were asked about their height, weight, and self-perceived attractiveness ('satisfied' or 'unsatisfied' with their looks).

Looking at employment status, Tao found that people who were more satisfied with their looks were significantly more likely to be in graduate study or in full-time work (compared with being unemployed), and taller people were significantly more likely to be in graduate study than unemployed. There were no significant effects for weight (relative to the unemployed). In terms of earnings, they found that:

...a 1% increase in height leads to a 0.41% increase in the entry wage. College graduates who perceive themselves as good-looking earn about 3.4% more than those who perceive themselves as plain-looking. The coefficients of the BMI logarithm and its square are positive and negative, respectively, and are significant at the 10% level. These two BMI coefficients imply that the optimal BMI is about 20.09.

The term "leads to" implies causality, but there is no causal interpretation here - this is merely a correlation. Taller people, and those who are more satisfied with their looks, earn more than shorter people and those who are less satisfied with their looks.
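The 'optimal BMI' in that quote comes from the log-quadratic form of the wage equation. The coefficients themselves are not reported in this post, so the following is only a sketch of the algebra, with the symbols $\beta_1$ and $\beta_2$ being my own notation (an assumption) for the coefficients on the BMI logarithm and its square:

```latex
% Sketch only: \beta_1 and \beta_2 are assumed notation, not values from the paper.
\begin{align*}
\ln w &= \alpha + \beta_1 \ln(\mathrm{BMI}) + \beta_2 \left[\ln(\mathrm{BMI})\right]^2 + \dots \\
\frac{\partial \ln w}{\partial \ln(\mathrm{BMI})} &= \beta_1 + 2\beta_2 \ln(\mathrm{BMI}) = 0
\quad\Longrightarrow\quad
\ln(\mathrm{BMI}^{*}) = -\frac{\beta_1}{2\beta_2} \\
\mathrm{BMI}^{*} &= \exp\!\left(-\frac{\beta_1}{2\beta_2}\right)
\end{align*}
```

With a positive $\beta_1$ and a negative $\beta_2$ (as in the quote), this turning point is a maximum. An optimal BMI of about 20.09 corresponds to $\ln(\mathrm{BMI}^{*}) \approx 3.00$.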

Tao then goes on to compare the size of these effects with corresponding effects based on the quality of education (as proxied by the type of college the person went to), and finds that the effect of schooling is much larger. I find these results less compelling though, because it is hard to compare the effect of being 1% (or even 10%) taller, with the effect of going to a private college rather than a public college. What is the meaningful comparison here? Both effects are statistically significant, and that is what matters.

Setting aside the self-reported nature of attractiveness, it is interesting that height and attractiveness are both statistically significant and positively correlated with earnings. Clearly, they are not simply picking up the same effect. It would be interesting to see some further analyses on other datasets, which include objective (rather than self-reported) measures of both height and attractiveness, and see whether this holds up to further scrutiny.

Wednesday, 29 January 2020

The division of Hamilton into electorates

I was interviewed for the Waikato Times yesterday about Hamilton electoral boundaries, and the story appeared this morning:

Hamilton residents could find themselves living in two completely new electorates under a bold new proposal to completely redraw the city's political landscape.

The existing Hamilton East and Hamilton West electorates would be scrapped and replaced with two new Hamilton North and Hamilton South electorates, if the suggestion by political pundit and former Green party candidate Mark Servian is adopted by the Representation Commission.

Servian reckons electoral boundaries based on Hamilton's northern and southern halves would better reflect the city's diverse and changing population and its "communities of interest"...

Michael Cameron, an associate professor in economics at Waikato University said the Servian proposal was "probably a non-starter" because it potentially clashed with the commission's objectives of having an equal number of electors in each electorate.

The demographics of each electorate were not a factor the organisation took into consideration. However topographical features were - and the river was a prominent and natural dividing line.

"The other thing the commission doesn't want to do is create any unnecessary confusion. Everyone in Hamilton would effectively be in a new electorate under this proposal, so it would probably not be acceptable to them on that basis."

Cameron said it was difficult to declare whether diverse or homogenous electorates were better.

"That's something for the commission to ponder, and that's why we have them to make those sort of decisions.

"We definitely don't want to end up in a situation where the politicians get to decide their own boundaries, like they do in the United States."

This is not the first time I've been asked to comment on electoral boundaries recently (see my earlier post here).

The Representation Commission makes decisions on New Zealand's electoral boundaries. You can read their 2019/20 report here. It recommends only one change relevant to Hamilton, which is to move the area around Horsham Downs from the Hamilton West electorate to the Waikato electorate. The Hamilton East electorate would remain unchanged.

I described the change to Hamilton North and Hamilton South as a non-starter, for two reasons. First, it would mean that every person in Hamilton would be in a new electorate. The Representation Commission is required to give consideration to "existing electorate boundaries", and in their report they say that:

Where possible, we have retained the existing electorate boundaries and names which are familiar to the public.

Clearly, reducing confusion for the public weighs into their considerations.

Second, the Commission is required to give consideration to "topographical features", and when it comes to Hamilton, the main topographical feature is the river, which neatly divides the city into East and West.

Other than total population (current and projected future), the Commission does not take into account demographic considerations. And it isn't clear to me that the result (in terms of representation) would be any better if they did. Do we really want socio-demographically homogeneous electorates, or electorates that are diverse? That's not a question I feel equipped to answer, and many people's answer would depend on their ideological preferences. And that's why we have an independent commission to make these decisions.

Read more:

Some notes on the 2018 Census data

Tuesday, 28 January 2020

The gender wage gap, when pay rates, job choice, and discrimination are not factors

There is a large and persistent gender gap in wages - men earn more than women. At least part of the gender gap is because men earn a higher hourly wage than women. However, what happens when a job is structured in such a way that hourly pay rates are gender neutral? Is there still a gender gap? That's the question that this job market paper by Valentin Bolotnyy and Natalia Emanuel (both Harvard University) seeks to answer.

Bolotnyy and Emanuel use human resources data for 3,011 bus and train operators working for the Massachusetts Bay Transportation Authority (MBTA) in Boston. This workplace is interesting in that:

Seniority in one’s garage is the sole determinant of workplace opportunities, a feature enshrined in the collective bargaining agreement that covers all MBTA bus and train operators. Conditional on seniority, male and female operators face the same choice sets of schedules, routes, vacation days, and overtime hours, among other amenities.

So, if the gender pay gap is apparent in MBTA jobs (having controlled for seniority), it can't be because of discrimination. As the authors note:

...the major explanations [for the gender wage gap] cluster into four categories: Women tend to work in lower-paying jobs; women have less experience; women face workplace discrimination; and women may be less willing to fight for better compensation. Our setting allows us to rule out all of these explanations for the earnings gap that we observe.

Bolotnyy and Emanuel find that:

...despite having such a controlled setting, the MBTA still has a gender earnings gap: female operators earn $0.89 for each male-operator dollar in weekly earnings...

Mechanically, the earnings gap in our setting can be explained by the fact that male operators take 1.3 (49%) fewer unpaid hours off and work 1.5 (83%) more overtime hours per week than their female counterparts. Female operators’ choices indicate that they value time outside of work more than do male operators and that they have greater demand for schedule predictability and controllability.

In other words, the gender pay gap exists in this job, and it arises because of the different work choices of men and women. However, that's not the end of the story:

While female operators take fewer overtime shifts than male operators, the driver of this difference is overtime opportunities that arrive on short-notice and therefore demand that operators are flexible about when they work. When overtime is scheduled the day before or the day of the necessary shift, male operators work almost twice as many of those hours as female operators. In contrast, when overtime hours are scheduled three months in advance, male operators sign up for only 7% more of them than female operators.

So, women are less flexible in their ability to take on additional overtime hours at short notice. Specifically, they show that women are less likely to accept overtime shifts on days when they are not already working, and on weekends. And there's more:

When it comes to overtime hours worked, unmarried female operators with dependents work only 6% fewer of them when they are preplanned 3 months in advance, but about 60% fewer of them when they are offered on short-notice. Unmarried women with dependents also take the largest amount of unpaid time off with FMLA [Family Medical Leave Act], making them the lowest earners in our setting.

I don't think anyone will be surprised to learn that one of the biggest factors explaining why women do not take overtime hours at short notice is having dependents (usually, children). Societal expectations around the childcare responsibility of women are driving their 'choices' here, making them less flexible in their ability to take short-notice overtime, and contributing to the gender wage gap.

If overtime is a big contributor, one way to reduce the gender wage gap would be to reduce the amount of overtime. The MBTA did try reducing overtime (although their goal was to reduce costs and increase efficiency, rather than anything to do with the gender wage gap):

In an effort to reduce absenteeism and overtime expenditures, the MBTA implemented two policy changes: one that made it harder to take unpaid time off with FMLA and another that made it harder to be paid at the overtime rate. While the policy changes reduced the gender earnings gap from 11% to 6%, they also decreased both male and female operators’ well-being.

Reducing the gender wage gap came at a cost to both men and women. So, the gender wage gap exists even when discrimination would not be possible, but reducing the gap would come at a cost to everyone. A relevant question then is, how much cost would society be willing to bear, in order to reduce the gender wage gap? That's not an easy question to answer (in the same way as the similar question of how much cost society is willing to bear in order to redistribute income more generally).

[HT: Marginal Revolution, back in late 2018]

Read more:

Compensating differentials, preferences, and the gender gap in wages

Sunday, 26 January 2020

People don't know how much they can drink, relative to the drink driving limit

Some new research was highlighted by the New Zealand Herald this week:

Motorists are still confused about "ambiguous" drink-drive laws, with many believing they are fine to get behind the wheel after three drinks.

New research released today shows the average motorist has a significant knowledge gap when it comes to the laws around drinking and driving.

While nearly three-quarters of motorists are confident they understand the rules, only 22 per cent actually know the correct legal adult limit - and 20 per cent believe they can have three or more drinks before driving.

The findings suggest people are basing how much they drink around the number of drinks they think they can have, rather than blood alcohol content which can vary greatly from person to person.

The research was undertaken by DB Breweries, and despite spending a fair amount of time searching, I can't locate it anywhere online. However, even taken at face value based on news reports, it doesn't really add anything new, beyond my own research that was published in 2018 (as I blogged about here), but based on data we collected in 2014:

The headline result is that drinkers generally have no idea of the breath alcohol concentration.

Our research was better than the DB Breweries research that has been reported in the media, because even if someone believes they can have two drinks and still be under the drink-driving limit, the limit is measured in terms of breath alcohol concentration, not in terms of the number of drinks. So, it doesn't matter so much if people can't judge the number of drinks they can have; it matters much more if they can't judge their breath alcohol concentration relative to the limit. And, as the quote above from my earlier blog post notes, people really have no idea (and in our follow-up research last year, which I blogged about last week, we found informally that things have not improved).

We found that drinkers at or below the breath alcohol limit for driving tended to overestimate their breath alcohol concentration, which is probably a good thing if your goal is to reduce drink driving in total. Most people are moderate drinkers, and having moderate drinkers avoid driving therefore limits the total number of drink drivers.

However, the results at the top end suggested that the heaviest drinkers underestimated their breath alcohol concentration. Some of those drinkers even thought they were under the drink driving limit, which is a real worry.

The problem here is not that the drink driving laws are "ambiguous" (as the Herald claims). The law is not ambiguous. However, people's understanding of how the limit relates to the number of drinks they can consume and remain under the limit is problematic. There doesn't seem to be an easy solution to this either. The best advice is that, if you are drinking and you are unsure, you should not drive.

Read more:

Drinkers generally have almost no idea of their breath alcohol concentration

Friday, 24 January 2020

Learning regression models, in reverse

Understanding regression models is a core skill for economists. It takes some time to learn the mechanics of different regression models, but even then many students struggle with understanding what a significant model 'looks like'. What if you could teach (or learn) about regression models, not by starting with some data that you are given and running the model, but by trying to create the data that would lead to a particular result?

It sounds dodgy, because certainly we don't want to encourage people to make up their own data. However, being able to visualise what it takes for a model to be statistically significant at different levels of significance is important. And now you can do exactly this, at this website. It has a very simple point-and-click interface, and is supported by learning exercises that you can try for yourself.

There is also a working paper that briefly describes the app, by Luke Froeb (Vanderbilt University). Here's what he says about the website:

The app "inverts" the usual pedagogy. Rather than teaching students how to run regressions on data, it asks them to create data to achieve a given outcome, like a statistically significant line. Exercises are designed to give students an intuitive feel for the relationship between data and regression, and to show them how regression is used.
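For a rough sense of what 'creating data to achieve a statistically significant line' involves, here is a minimal sketch in Python. It is not the app's code, and the two made-up datasets are my own. Both follow the same underlying line, but one has small deviations and the other large ones. The cut-off of 2.306 is the standard two-sided 5% critical value of the t-distribution with 8 degrees of freedom (ten points, minus two estimated parameters).

```python
import math

def slope_t_stat(xs, ys):
    """Return (slope, t-statistic) from a simple OLS regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, then the standard error of the slope
    rss = sum((y - intercept - slope * x) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / se

T_CRITICAL = 2.306  # two-sided 5% critical value, 8 degrees of freedom

xs = list(range(1, 11))
# Made-up data: y = x plus a small or large alternating deviation
tight = [x + d for x, d in zip(xs, [0.5, -0.5] * 5)]
noisy = [x + d for x, d in zip(xs, [8, -8] * 5)]

for label, ys in (("tight", tight), ("noisy", noisy)):
    slope, t = slope_t_stat(xs, ys)
    verdict = "significant" if abs(t) > T_CRITICAL else "not significant"
    print(f"{label}: slope = {slope:.2f}, t = {t:.2f} ({verdict} at 5%)")
```

The same underlying relationship is comfortably significant in the tight data and nowhere near significant in the noisy data, which is the sort of intuition the exercises seem designed to build.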

There's hours of fun to be had, playing with making your own data and trying to get it to fit increasingly tricky scenarios. Enjoy!

[HT: Marginal Revolution]

Tuesday, 21 January 2020

Pre-drinking and the night-time economy

I'm a little unusual among economists, in that I really like to get out and do fieldwork (ok, maybe not that unusual any more, given the Nobel Prize winners of last year do it as well). Some of my most interesting fieldwork experiences have come overseas, or interviewing drunk people in the night-time economy. In fact, I posted about some of my work on the latter back in 2018.

I was lucky enough to secure some research funding from the Health Promotion Agency to repeat similar work last year (in fact, given the number of my students I encountered, probably some of the readers of this blog remember seeing me out in town late at night, breathalyser in hand). That research, which is joint work with Matthew Roskruge (Massey University), Nic Droste and Peter Miller (both Deakin University), is now published on the Health Promotion Agency website.

This time around, we had three research questions in mind:

1. Where and when do pre-drinkers (people drinking before a night out or event) obtain their alcohol?;
2. What is the difference in the level of intoxication of pre-drinkers vs. non-pre-drinkers, and how does this difference vary over the course of a night?; and
3. Is the level of intoxication of pre-drinkers related to where and when they obtain their alcohol?

We also looked at the motivations for pre-drinking, and at the prevalence of side-loading behaviour (side-loading is the consumption of alcohol during a night out or event, occurring at a location other than a licensed venue). In this post, I'm just going to focus mostly on the first and third research questions, which I think have the most policy relevance (if you're more interested in the other research questions, then read the report).

The reason for looking at those research questions is easy to explain. There are lots of intoxicated people out and about in the night-time economy, and there's a lot of alcohol-related harm that arises from this. If you want to reduce alcohol-related harm, then one way is to try to reduce the amount of drinking. However, it isn't clear where policy should be directed. The bar owners will tell you that the problem is pre-drinking - people get drunk before they come into town for the night, and then cause problems. In that case, you may want to target policy at the off-licence outlets, since they are the main cause of the problems. However, the off-licences argue that making them close earlier wouldn't be effective, because people plan ahead and buy their drinks for pre-drinking ahead of time.

So, which side is correct? It turns out both. Pre-drinking is a big contributor to the level of intoxication in the night-time economy (we showed that in our earlier research, and again in this work). So, the bar owners are right.

However, when we looked at where and when pre-drinkers were buying their alcohol for pre-drinking, we found that the majority of pre-drinkers purchase their alcohol on the day that they consume it, but more than half of them were purchasing sometime before 6 p.m. So, to have an appreciable impact on pre-drinking by modifying off-licence trading hours, you'd have to make the off-licences stop selling alcohol awfully early in the evening. It would be hard to make the case for that as a policy solution.

One of the biggest motivations for pre-drinking is price, which we found here, and which has also been found many times in international research. Alcohol is much less expensive when purchased from an off-licence and consumed at home, or on the way into town, than it is when purchased at a club or bar. If curbing pre-drinking is an important means of reducing alcohol-related harm, then it seems more feasible to try to reduce the price differential between bars and off-licences, than to mess with off-licence trading hours.

One of the great ironies of the Sale and Supply of Alcohol Act, which came into force in December 2013, was that it prohibited alcohol outlets from selling low-priced drinks. On the surface, this makes sense. People drink more when alcohol is less expensive (that's the simple downward-sloping demand curve at work). However, in practical terms, prohibiting low-priced drinks killed the 'happy hour' at bars and clubs, which was often early in the evening. Being able to buy cheap drinks in happy hour encouraged at least some people to come into town early. [*] Without happy hours, the incentives change in favour of drinking at home and coming into town later in the evening. I'd argue that this change has probably contributed to a continuing increase in pre-drinking behaviour.

So, if reducing alcohol-related harm through curbing pre-drinking is a policy goal, looking at how alcohol is priced is important. I'm not necessarily arguing for a return of the happy hour, in order to reduce the price differential between bars and off-licences. However, the obvious alternatives are either: (1) increasing excise taxes at off-licences but not bars, which seems unnecessarily complicated (especially since there are bars that also have an off-licence); or (2) minimum unit pricing, which would affect off-licences but be unlikely to affect bars. More on that in a future post.

*****

[*] Not everyone, obviously. And let's be clear - pre-drinking is not a new behaviour, as it was common when I was an undergraduate student.

Read more:

The temporal gradient of intoxication in the night-time economy

Monday, 20 January 2020

Marijuana legalisation and local crime rates

In my final post of last year, I talked about a research paper on the effect of marijuana legalisation on drug dealers. The overall conclusion of the paper was that marijuana legalisation increased recidivism of marijuana dealers, inducing them to switch to crime related to harder drugs:

Following legalization, marijuana offenders become 4 to 5 percentage points more likely to re-enter prison within 9 months of release. The effect is sizable, corresponding to a near 50% increase from a baseline rate of 10 percent. When decomposed by crime categories, I find the overall increase masks two countervailing effects. One, marijuana offenders became less likely to commit future marijuana offenses. Two, this reduction is offset by the transition to the trafficking of other drugs. As a result, the observed criminality of former marijuana traffickers increased.

However, that is definitely not the end of the story. I just read an article by Jeffrey Brinkman and David Mok-Lamme (both Federal Reserve Bank of Philadelphia), published in the journal Regional Science and Urban Economics (ungated earlier version here), where the conclusions seem to almost be the opposite. Brinkman and Mok-Lamme look at crime data at the census tract level in Denver, over the period from 2013 to 2016 (retail marijuana for recreational use became available in Colorado on 1 January 2014). They use an interesting identification strategy:

While the legalization of recreational marijuana in 2014 applied to the entire state, many municipalities within Colorado prohibit sales within their own jurisdictions. Residents living in municipalities near Denver that prohibit recreational sales often travel to Denver to purchase marijuana. Therefore, locations within Denver that have more access to demand from neighboring municipalities show more growth in their dispensary density, ceteris paribus. In addition, out-of-state tourists could purchase marijuana starting in 2014, further increasing the demand for dispensaries in locations with access to broader outside markets. In the empirical analysis, we use two geospatial variables to proxy for access to outside demand: a neighborhood’s proximity to municipal borders and proximity to major roads or highways. These variables are then used to instrument for changes in locations of dispensaries over time.

I was initially sceptical of this, because areas close to the outer border of Denver are further from the central business district, and consequently suffer less crime. However, their supplementary analyses, including where they show that the effect is unique to the period after recreational marijuana became available, convinced me. They find that:

...an additional dispensary per 10,000 residents is associated with a reduction of 17 crimes per 10,000 residents per month. The average number of crimes per 10,000 residents in Denver is 90 per month, so an additional dispensary is associated with roughly a 19 percent decline in crime.

The results from the supplementary analysis I mentioned (which is just one among many) suggest a smaller effect, on the order of 14 fewer crimes per 10,000 residents per month (a reduction of 16 percent). Some of their other results are interesting as well. For instance:

Dispensary densities after 2014 increased more in neighborhoods with higher poverty rates, with higher levels of employment, that are closer to the central business district, and where there is more useable land.

To be honest, we see something similar with off-licence alcohol outlets. In my own work, we have reasoned that more outlets locate in poorer areas because rents are lower, and because poorer residents are unable to (or unable to afford to) travel long distances to obtain alcohol. The latter effect leads to markets that are much more localised in poorer areas.

That marijuana dispensaries tend to locate in poorer areas leads to perverse effects on the standard OLS (ordinary least squares) regression model, which shows that marijuana dispensaries are in areas with more crime, even controlling for poverty and other neighbourhood characteristics. That raises questions about a lot of the research linking off-licence alcohol outlets with crime, where similar effects might be at play.
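To illustrate why OLS and the instrumental variables approach can point in opposite directions, here is a small simulation sketch in Python. Everything in it is invented for illustration: the numbers are not Denver data, `hidden` stands in for unobserved neighbourhood characteristics that attract both dispensaries and crime, and a single simulated `border` variable stands in for the paper's two proximity instruments. The true effect of dispensaries on crime is set to -1 by assumption.

```python
import random

random.seed(0)

def cov(a, b):
    """Population covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / len(a)

n = 20000
TRUE_EFFECT = -1.0  # assumed: each extra dispensary lowers crime by one unit

# Instrument: shifts dispensary density, but affects crime only through it
border = [random.gauss(0, 1) for _ in range(n)]
# Unobserved factor that raises both dispensary density and crime
hidden = [random.gauss(0, 1) for _ in range(n)]

dispensaries = [z + u + random.gauss(0, 1) for z, u in zip(border, hidden)]
crime = [TRUE_EFFECT * d + 4 * u + random.gauss(0, 1)
         for d, u in zip(dispensaries, hidden)]

# OLS slope: cov(x, y) / var(x), contaminated by the hidden factor
ols = cov(dispensaries, crime) / cov(dispensaries, dispensaries)
# IV (Wald) estimate: cov(z, y) / cov(z, x), using only instrument-driven variation
iv = cov(border, crime) / cov(border, dispensaries)

print(f"OLS estimate: {ols:.2f}")
print(f"IV estimate:  {iv:.2f}")
```

In this made-up setting the OLS estimate comes out positive (more dispensaries, more crime), while the IV estimate lands close to the assumed true effect of -1. That is the same qualitative pattern as the contrast between the OLS and instrumented results described above.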

Finally, coming back to the overall results, this research shows that marijuana legalisation is associated with lower crime. That is the opposite conclusion to the paper by Heyu Xiong I discussed last month. The two studies use different data sources and cover different regions (Xiong had Oregon and Washington states in his analysis, as well as Colorado). This study was purely based on the urban area of Denver (although a supplementary analysis they report at the county level for all of Colorado was suggestive of a negative effect as well).

Brinkman and Mok-Lamme test for spatial spillovers into surrounding neighbourhoods, and don't find any. One explanation that might link both studies is if the spatial spillovers are wider than that. Perhaps, in addition to moving to harder drug crime, former marijuana dealers are forced to move to other areas as well? Certainly, there is more work to be done in this area.

[HT: Marginal Revolution, last August]

Read more:

Marijuana legalisation and the displacement of drug dealers

Sunday, 19 January 2020

Happiness inequality, revisited

At the start of the month, I wrote a post about happiness inequality. The research paper I reference there did a poor job (I think) of measuring happiness inequality (using the standard deviation of happiness). I just finished reading this 2013 article, by Indranil Dutta (University of Manchester) and James Foster (George Washington University), published in the journal Review of Income and Wealth (appears to be gated, but there is a working direct link to the paper here), which does a much better job.

Dutta and Foster use data from the US General Social Survey from 1972 to 2010, which has 49,433 observations of happiness, all measured as responses to the question: "Taken all together, how would you say things are these days - would you say that you are very happy, pretty happy or not too happy?"

They discard the standard deviation as a measure, much as I did in my earlier post:

Variance or standard deviation is an unsatisfactory measure of inequality under a cardinal scale...

Instead, they use measures based on the median, rather than the mean or the standard deviation. It's possible that basing an inequality measure on the median may go some way to reducing the problems with happiness measures I highlighted in this post last week, because the median-based measure is scale invariant (it doesn't depend on how you weight the different 'levels' of happiness). I'm sure someone who is much more mathematically inclined than I am can get to the bottom of that question, and given the strength of the conclusions drawn by Bond and Lang against happiness measures, I'd say that it is already a priority for someone.
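To see why the standard deviation is a problem for this sort of data, consider a quick sketch in Python. The two groups and both numerical scales are invented for illustration (they are not from the General Social Survey). The three response categories are ordered, but nothing tells us how far apart they are, so both scales below are equally defensible ways of coding them.

```python
from statistics import pstdev

def spread(counts, scale):
    """Standard deviation of happiness when each category is coded using `scale`."""
    values = [score for score, count in zip(scale, counts) for _ in range(count)]
    return pstdev(values)

# Hypothetical counts of (not too happy, pretty happy, very happy)
group_a = [30, 40, 30]
group_b = [10, 50, 40]

linear = [1, 2, 3]      # evenly spaced coding
stretched = [1, 2, 10]  # same ordering, but 'very happy' is far above the rest

for name, scale in (("linear", linear), ("stretched", stretched)):
    sd_a = spread(group_a, scale)
    sd_b = spread(group_b, scale)
    more_unequal = "A" if sd_a > sd_b else "B"
    print(f"{name}: SD(A) = {sd_a:.2f}, SD(B) = {sd_b:.2f}, more unequal: {more_unequal}")
```

Under the evenly spaced coding, group A looks more unequal. Under the stretched coding, group B does. Nothing about the underlying responses has changed, only the arbitrary numbers attached to the categories.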

Anyway, using their median-based measures Dutta and Foster find that:

...happiness inequality decreased from its highest level in the 1970s, through the 1980s and 1990s. Only in the 2000s did it start to rise again. However, in 2010 there has been a remarkable decline in inequality, making it the year with the lowest inequality under the linear scale of the AF measure. This achievement is offset, to some extent, by the fact that the average level of happiness in 2010 turns out to be the lowest among all the years.

This is an interesting result, when you place it alongside the fact that over this period, inequality in incomes has been increasing in the U.S. So, there is increasing inequality in incomes alongside decreasing inequality in happiness, and a moderate decline in median happiness. It would be interesting to consider a model that fits those stylised facts.

They also find that:

...on average, women have higher happiness inequality relative to men.

Add that to the list of stylised facts to explain in a model of happiness and inequality. Why is happiness both declining and converging over time in the U.S.?

Friday, 17 January 2020

The gender gap in reviewing and editing for top economics journals

I've written a number of posts about the gender gap in economics (most recently, this one; see the list at the end of this post for more). So, I was interested to read this article by David Card, Stefano DellaVigna (both University of California, Berkeley), Patricia Funk (Università della Svizzera italiana), and Nagore Iriberri (University of the Basque Country), published in the Quarterly Journal of Economics (ungated earlier version here). In the paper, they look at the reviewing and editing process for four top economics journals (Journal of the European Economic Association, Quarterly Journal of Economics, Review of Economics and Statistics, and Review of Economic Studies), in terms of gender bias.

They have data on nearly 30,000 submissions to those four journals, which they use to:

...analyze gender differences in how papers are assigned to referees, how they are reviewed, and how editors use referee inputs to reach a revise and resubmit (R&R) verdict.

There is both good and bad in their results. First, they find that:

...female-authored papers receive 22 log points (std. err. = 0.05) more citations than male-authored papers, controlling for the referee evaluations. Our estimate of this gender gap is robust to alternative measures of citations and to a variety of alternative specifications...

What this means is that:

...female-authored papers would have to be of 28 log points (32%) higher quality than male-authored papers to receive the same referee assessment.
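For anyone unfamiliar with log points, the conversion to a percentage difference is straightforward. Here is a short Python sketch (the function name is my own):

```python
import math

def log_points_to_percent(log_points):
    """Convert a difference in log points to a percentage difference."""
    return (math.exp(log_points / 100) - 1) * 100

print(round(log_points_to_percent(28), 1))  # 32.3
print(round(log_points_to_percent(22), 1))  # 24.6
```

So the 28 log points in the quote correspond to roughly 32 percent, as stated, and the 22 log point citation gap corresponds to roughly 25 percent.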

That gender gap is then perpetuated through the publication process:

On average editors tend to follow the referees’ recommendations, putting essentially no weight on author gender in their R&R decisions. This means that they are overrejecting female-authored papers relative to a citation-maximizing benchmark.

There are at least a couple of interpretations for what is going on here:

There are two main explanations for our finding that female-authored papers receive more citations, conditional on the referee evaluations. The first is that referees hold female authors to a higher bar, perhaps because of stereotype biases. The second is that female-authored papers have characteristics that lead to higher citations but are not as highly rewarded in the review process. For example, female authors may tend to write more empirically oriented papers, or concentrate on certain topics within broad field categories that referees undervalue relative to expected citations.

Card et al. find suggestive evidence that paper characteristics play some role. That is, female authors do concentrate on different fields than male authors, and those fields tend to attract more citations. The news isn't all bad though:

We find no gender differences in the time that referees take to return a recommendation, in the time that editors take to reach a decision, or in the time between submission and acceptance for published papers.

There's also no difference between female and male reviewers, so if male reviewers are holding female-authored papers to a higher bar, then so are female reviewers. However, they couldn't assess differences between female and male editors, because of a lack of female editors (!).

This paper also made me wonder whether there is a third explanation, which the authors did not identify. Female authors might self-censor, sending only their very best papers to these top journals, but sending their second-tier papers to other, lower-ranked journals. In contrast, male authors might send those second-tier papers to the top journals anyway. That would lead the average quality (as measured by citations) of female-authored papers to be higher than male-authored papers.

Regardless of explanation, the results have clear and negative implications for female economists, who have to work harder to get the same outcomes as male economists. And the solutions are not as straightforward as equalising gender participation. Card et al. note in their conclusion that:

One potential remedy to help female economists — using more female referees — is unlikely to help, given that female referees hold female-authored papers to the same higher bar as do male referees.

Their preferred solution is disappointingly vague (although admittedly, I'm not sure I can offer anything better):

It appears to us that a simpler path is to increase the awareness of the higher bar for female-authored papers. The referees and editors can then take it into account in their recommendations and decisions.

Of course, that means moving away from the double-blind reviewing process (which is a fiction in any case - but that's a topic for another post).

Read more:

Thursday, 16 January 2020

Several research articles on inequality

The Autumn 2019 issue of the journal Oxford Review of Economic Policy was a special issue on inequality. There were several good papers, which I just got around to reading. I'm not going to discuss them all in great detail, but here are a few things that I thought were interesting or important.

Thomas McGregor (IMF), Brock Smith (Montana State University), and Samuel Wills (University of Sydney) had an article on the measurement of inequality (sorry, I don't see an ungated version). They ask some very pertinent questions in the introduction:

Measuring inequality is therefore not straightforward because it first requires answering a series of questions. What variable do we care about? What population is the focus? And what properties of that variable’s distribution matter for our purposes, which we can summarize in a statistic? The answers to each of these questions will depend on the researcher’s purpose: there should be different measures, of different variables, for different goals.

In terms of variables, do we (or should we) care more about inequality in income, or wealth, or something else (wellbeing, perhaps)? In terms of population, should we worry more about global inequality, or inequality within each country? Global inequality makes sense, but it is only inequality within countries that we can effectively target with policy. Which statistic is best for measuring inequality is a huge can of worms that is probably best left to another post (or a series of posts).

Brian Nolan and Luis Valenzuela (both University of Oxford) have an article on the change in country-level inequality over time (sorry, no ungated version). In terms of changes over time, it is interesting to note that over the period from 1980 to 2007, New Zealand's inequality grew only slightly more than the OECD average that Nolan and Valenzuela report (although it is worth noting that the increase in inequality in New Zealand was heavily concentrated in the late 1980s and early 1990s - see for example this post). I particularly liked this bit from their conclusion:

The ‘grand narrative’ that a sustained rise in income inequality is driving stagnating real incomes around and below the middle, exacerbating social ‘bads’, and fuelling ‘revolt against the elites’ probably comes closer to reflecting the experience of the US than many other rich countries, but is not the whole story even there. More to the point, the US case is not representative of the experience of rich countries with respect to inequality and income growth over recent decades, which has been much more varied than this ‘grand narrative’ recognizes.

We see far too much of that 'grand narrative' in New Zealand, where it simply isn't true. This is a point that I try to hammer home in my ECONS102 class.

In terms of global inequality, Ravi Kanbur (Cornell University) has an article (ungated earlier version here) and he makes the case (in contrast to the policy-based argument above in favour of looking at country-level inequality) that:

...a country-by-country analysis, while important for establishing the basic facts, is incomplete for a world where countries are ever more knitted together by trade and investment and where, perhaps, our common humanity calls for an assessment of global inequality rather than national inequality in isolation.

He shows the (by now hopefully well known) fact that global inequality has been falling (see this post for example, or this one):

What explains this global trend of falling inequality? It is helpful to go back to the notion that world inequality is composed of inequality between nations and inequality within nations. Inequality between nations, that part of global inequality brought about by difference in average incomes of rich and poor countries, accounts for the bulk of global inequality... What happened between 1988 and 2008 (and, indeed, over the longer period of globalization before and after this period) is that poorer countries like India, China, and Vietnam grew much faster than rich countries like the US. The effect on between-nations inequality was so great that global inequality fell.
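The between-nations and within-nations split can be made concrete with an inequality index that decomposes exactly. The sketch below uses the Theil T index, which is my choice for illustration and not necessarily the measure Kanbur relies on. The two "countries" and their incomes are entirely made up, and the function names are hypothetical:

```python
import math

def theil_t(incomes):
    """Theil T index: mean of (y/mean) * ln(y/mean) across people."""
    mean = sum(incomes) / len(incomes)
    return sum((y / mean) * math.log(y / mean) for y in incomes) / len(incomes)

def decompose(groups):
    """Split overall Theil T into between-group and within-group parts.
    Each group is weighted by its share of total income."""
    pooled = [y for g in groups for y in g]
    total = sum(pooled)
    overall_mean = total / len(pooled)
    between = within = 0.0
    for g in groups:
        share = sum(g) / total
        group_mean = sum(g) / len(g)
        between += share * math.log(group_mean / overall_mean)
        within += share * theil_t(g)
    return between, within

# Hypothetical incomes: [rich country, poor country] in two periods.
# The poor country's incomes triple; the rich country's grow by 10%.
earlier = [[40, 50, 60], [2, 3, 4]]
later = [[44, 55, 66], [6, 9, 12]]

for label, groups in (("earlier", earlier), ("later", later)):
    between, within = decompose(groups)
    print(f"{label}: between={between:.3f} within={within:.3f} "
          f"total={between + within:.3f}")
```

In this toy example the between-country component dominates in both periods, and it falls sharply when the poorer country grows faster, dragging total inequality down with it. That is the same mechanism Kanbur describes, in miniature.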

On policy, he concludes that:

We are thus left with the conundrum that addressing national-level inequality through national policies will be less effective unless cross-national agreements can be reached on a range of tax and investment issues. The weakness of global institutions in addressing these questions is surely another sense in which we are living in an age of rising inequality.

Finally, Davide Furceri and Jonathan Ostry (both IMF) have an article where they analyse the factors associated with country-level inequality (sorry, no ungated version), using a model averaging approach. They first note that:

The list of potential drivers of inequality includes, among others, the level of economic development, macroeconomic policies, structural reforms, and structural features such as demographics, technology, or institutions.

They then find that:

...there is not just one single factor that is the main robust driver of the level of inequality and its evolution, in contrast to what is sometimes alleged (with technology or trade looming large as possible mono-causal culprits). In the cross-section, we find that the level of development and demographics, as well as unemployment and globalization, play key roles. Interestingly, the effects of trade and financial globalization go in opposite directions, as inequality drivers. While trade globalization is associated with lower inequality, particularly in developing economies, financial globalization is associated with higher inequality.

It is particularly interesting to me that they find trade is associated with lower inequality. I'll have another post on that topic in the near future, relating to research with one of my PhD students.

If you are interested in inequality, and you have access to this journal, there is a lot of good information to be had from these four articles, as well as the others in the same special issue.

Wednesday, 15 January 2020

The impact of Sesame Street on education and employment

When I read Andrew Leigh's book Randomistas recently (which I reviewed here), I noted my surprise at the fact that Sesame Street had been subjected to a randomised controlled trial (or rather, more than one). So, when Tim Harford also wrote about Sesame Street last week, it prompted me to follow up. Harford was talking about this article by Melissa Kearney (University of Maryland) and Phillip Levine (Wellesley College), published in the American Economic Journal: Applied Economics (open access, but just in case an ungated version is available here). Long-time readers of this blog might remember my 2014 post about research on the MTV show 16 and Pregnant and its effect on teen pregnancy, which was also based on a paper by Kearney and Levine.

In the Sesame Street paper, they looked at how potential county-level exposure to Sesame Street affected education outcomes in early childhood and high school, and later employment outcomes. They exploit the fact that, despite its popularity, not everyone could watch the show when it first aired:

Following its introduction, Sesame Street was mainly broadcast on PBS channels; of the 192 stations airing the show, 176 of them were affiliated with PBS. The majority of these stations (101) were broadcast on UHF channels rather than VHF channels, which introduced technological constraints that limited exposure to the show. As we detail below, only around two-thirds of the population lived in locations where Sesame Street could be received on their televisions.

Their measure of coverage is based on "distance to the closest television tower broadcasting Sesame Street and whether that tower transmits using UHF or VHF". They then compare cohorts of people who entered first grade between 1959 and 1968 (before Sesame Street first aired) with cohorts who entered first grade from 1970 to 1974, who would have been more likely exposed to Sesame Street before they began school. They then use 1980 Census data to measure 'grade-for-age' (whether people are in the appropriate grade for their age, rather than being held back a year or more), 1990 Census data to measure completed schooling, and 2000 Census data to measure labour market outcomes. When comparing high coverage and low coverage areas, they found that:

...children who were preschool age in 1969 and who lived in areas with greater simulated Sesame Street coverage were statistically significantly more likely to be at the grade level appropriate for their age... A 30-point increase in coverage rates would generate a 3.2 percentage point (0.3 × 0.105 = 0.032) increase in the rate of grade-for-age status. With 20.3 percent of the sample behind their appropriate grade in school, this estimate implies that moving from a weak to strong reception county would lower that rate by around 16 percent...

This effect on grade-for-age status is particularly pronounced among boys. The estimated effect is largest (in absolute terms) for black, non-Hispanic children, but the estimated coefficients are not statistically significantly different across race/ethnic groups.

That effect is comparable in size to contemporary pre-school interventions such as Head Start. The effects are not persistent however, and are mostly gone by high school, where they find:

The results... regarding educational attainment provide no evidence of changes in these outcomes. Parameter estimates are small, statistically insignificant, and inconsistent with the expected pattern across cohorts...

Similarly, using data from the High School and Beyond survey, they find no effect of coverage on educational outcomes. Finally, they find not much of interest in terms of labour market outcomes either:

The results... regarding labor market outcomes suggest some small long-term labor market improvements. Parameter estimates all take on the expected signs: positive for employment and wages, negative for living in poverty. The estimated impact on employment is statistically significant at the 5 percent level.

The magnitude of these effects, though, is small... Our standard for interpreting magnitudes has been to evaluate the impact of moving from a weak reception county to a strong reception county, characterized by a 30 point increase in coverage. These results predict that employment would rise by about 1 percentage point (0.3 × 0.034).

In other words, Sesame Street appears to have good outcomes for young children, ensuring that they maintain grade-for-age status, but those effects are mostly gone by high school or when they are in the labour market later.
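The back-of-the-envelope magnitudes in the quotes above are easy to reproduce. The snippet below simply redoes the arithmetic using the figures quoted from Kearney and Levine; the variable names are mine, and I am assuming (as the 0.3 × 0.105 calculation implies) that coverage is measured on a 0 to 1 scale:

```python
# Figures quoted from Kearney and Levine; variable names are illustrative.
coverage_change = 0.30    # weak to strong reception county: 30-point rise in coverage
grade_coef = 0.105        # quoted coefficient for grade-for-age status
share_behind = 0.203      # share of the sample behind their appropriate grade
employment_coef = 0.034   # quoted coefficient for employment

grade_effect = coverage_change * grade_coef
relative_drop = grade_effect / share_behind
employment_effect = coverage_change * employment_coef

print(f"Grade-for-age: +{grade_effect * 100:.2f} percentage points")
print(f"Relative fall in share behind grade: {relative_drop * 100:.1f}%")
print(f"Employment: +{employment_effect * 100:.2f} percentage points")
```

With these rounded inputs the grade-for-age effect comes out at about 3.15 percentage points (reported as 3.2 in the paper, presumably from unrounded coefficients), which is a fall of about 15.5 percent in the share behind grade, and the employment effect is about one percentage point.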

However, that might not be the end of this story. As with many studies of this type, migration creates a problem for the analysis. Categorising people's exposure to Sesame Street is necessarily uncertain, particularly if they may have been exposed in one county and then moved to another county that has a different coverage rate. Kearney and Levine try to address this by restricting their analysis to people still living in their state of birth, but migration still creates the potential for measurement error in the most important variable (coverage). Measurement error leads to attenuation bias (a tendency for estimated effects to be biased towards zero in statistical models). This attenuation bias is likely to be least problematic for short term outcomes (like grade-for-age) when not many people have migrated, but will increase over time and be a much larger problem for measuring the effects on labour market outcomes.
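To see how attenuation bias works, here is a small simulation sketch. It is not based on Kearney and Levine's data. It assumes 'classical' measurement error (random noise added to the true exposure, unrelated to anything else), which may or may not describe migration-related mismeasurement well, and all of the numbers and names are made up for illustration:

```python
import random

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 20000
true_effect = 1.0        # hypothetical true effect of exposure on the outcome

# True exposure, and an outcome that depends on it plus noise
true_exposure = [rng.gauss(0.0, 1.0) for _ in range(n)]
outcome = [true_effect * c + rng.gauss(0.0, 0.5) for c in true_exposure]

# Re-estimate the effect using increasingly noisy measures of exposure
slopes = {}
for noise_sd in (0.0, 0.5, 1.0):
    measured = [c + rng.gauss(0.0, noise_sd) for c in true_exposure]
    slopes[noise_sd] = ols_slope(measured, outcome)
    print(f"measurement noise sd={noise_sd}: estimated effect={slopes[noise_sd]:.2f}")
```

With classical measurement error, the estimated slope shrinks towards zero by the ratio of the variance of true exposure to the variance of measured exposure. In this setup that means estimates of roughly 1.0, 0.8, and 0.5 as the noise grows, even though the true effect never changes. If migration makes the coverage measure noisier for outcomes observed decades later, the same logic would make long-run effects look smaller than they really are.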

So, I don't think we can say for sure that Sesame Street has no long-term effects. If it were at all possible to follow up on some of the participants in the early randomised controlled trials of Sesame Street, we might be able to get a better sense of this. In the meantime, we have only some suggestive evidence of impact beyond the early school years.

Read more:

Book review: Randomistas

Tuesday, 14 January 2020

Cryptocurrency company name changes; and the energy costs of bitcoin mining

Back in September last year, I wrote a post about the effects on share prices of re-naming a company to include 'Blockchain' in its name:

In other words, the companies that changed name were not doing well (in terms of share price) in the lead up to their name change. They then saw a massive increase in their share price, up to 30 days after changing name, presumably as investors looking to jump on the cryptocurrency bandwagon bought into the companies. Then the share price started falling back to its original level.

The paper I referred to, by Jain and Jain, was published in the journal Economics Letters. The latest issue of the same journal has another article in a similar vein (sorry I don't see an ungated version online, but it appears that it is open access for now), this one authored by Prateek Sharma (IIM Udaipur), Samit Paul (IIM Calcutta), and Swati Sharma (Jawaharlal Nehru University). They build on the earlier Jain and Jain paper, but extend the analysis in several ways, most importantly by: (1) extending the sample from 10 company name changes to 39; and (2) considering a comparison group of crypto-currency-related companies that did not change names, and another comparison group of non-crypto-currency-related companies that did change names. The comparison groups were matched to the sample in terms of share price, market capitalisation and value.

Looking at share returns, they find that:

...the share price of the sample firms increases significantly following the name change announcement... The most dramatic price increase occurs in the first couple of days, as the average share price increases from $2.24 on day −1 to $3.26 on day +1. We find no evidence of a post-announcement negative drift in share prices. The average share price is $4.07 on day +10, $4.36 on day +30 and $4.65 on day +50.

Comparing with the comparison groups, they find that:

These abnormal returns cannot be explained by industry factors, as the sample firms generate significantly higher abnormal returns than those generated by the sample of matched cryptocurrency firms that did not change their names... the sample firms experience larger and more permanent changes in value compared to similar noncryptocurrency firms that changed their corporate names during the sample period.

A rose by any other name may smell as sweet, but a company that includes blockchain in its name is clearly sweeter (apologies to The Bard).

Another article in the same issue of that journal also caught my attention, because it also relates to bitcoin. Debojyoti Das (Woxsen School of Business in India) and Anupam Dutta (University of Vaasa in Finland) look at the correlation between bitcoin miners' total revenue (not individual-level data, but revenue in total for all bitcoin miners) and energy usage, using data from February 2017 to March 2019. This is a low-quality paper, not least because they use quantile regression when it is totally unnecessary, so I won't dwell on it too much. Moreover, the results are surprising and under-explained, which also makes me wonder why this study was published in this form.

Das and Dutta find a negative correlation between miners' revenue and energy usage. In other words, total revenue (for all miners collectively) is high when their energy usage is low, and total revenue is low when energy usage is high. I would have thought that, when miners were engaged in more activity, revenue would be high and so would energy usage. That would suggest the correlation should be positive, not negative. However, that assumes that we are holding the price of bitcoin constant. These results basically suggest that the price of bitcoin is negatively correlated with energy usage, and it is hard to see why that would be. This is definitely a study that requires revisiting, and with more appropriate quantitative methods.

Read more:

The share market effects of naming companies after blockchain

Sunday, 12 January 2020

Immigration, native employment, and political backlash

Recent waves of immigration have caused political debate in many western countries. However, this isn't the first time that immigration has had this effect. In a new article published in the journal Review of Economic Studies (ungated earlier version here), Marco Tabellini (Harvard Business School) looks at the period from 1910 to 1930 in the U.S., a period:

...when the massive inflow of European immigrants was abruptly interrupted by two major shocks, World War I (WWI) and the Immigration Acts (1921 and 1924)... Also at that time, anti-immigration sentiments were widespread, and the introduction of immigration restrictions was advocated on both economic and cultural grounds.

The paper is very detailed, and makes use of a combination of Census and other data. Tabellini first finds that:

...cities cut public goods provision and taxes in response to immigration... the reduction in tax revenues was entirely driven by declining tax rates, while the fall in public goods provision was concentrated in categories where either inter-ethnic interactions are likely to be more salient (e.g. education) or poorer immigrants would get larger implicit transfers (e.g. sewerage, garbage collection). These findings suggest that immigrants were perceived as a fiscal burden, and that immigration reduced natives’ demand for redistribution.

So, immigrants were perceived as a burden on the taxpayer. Next, he finds that:

...immigration reduced the pro-immigrant party’s (i.e. Democrats) vote share, and was associated with the election of more conservative representatives... most directly reflecting natives’ demand for anti-immigration policies, members of the House representing cities more exposed to immigration were significantly more likely to support the National Origins Act of 1924, which put an end to the era of unrestricted immigration to the U.S.

So, the immigration backlash was reflected in political outcomes. Tabellini then investigates why there was such a backlash:

I start from the first, and perhaps most obvious possibility: immigrants might have increased labour market competition, lowering wages and raising unemployment among native workers. Yet, in contrast with this idea, I find that immigration had a positive and statistically significant effect on natives’ employment. My estimates are quantitatively large, and imply that a 5 percentage points increase in immigration (roughly one standard deviation) increased natives’ employment by 1.4 percentage points, or by 1.6% relative to its 1910 level.

In other words, there was a backlash against immigrants in spite of their positive impact on natives' employment (to be clear, 'natives' here refers to the U.S.-born population, not to Native Americans). Tabellini then goes further, and shows that:

These results were made possible by two mechanisms. First, immigration increased firms’ investment and productivity, generating an outward shift in labour demand. Second, because of complementarity, natives moved away from occupations that were more exposed to immigrants’ competition and specialized in jobs where they had a comparative advantage and, because of discrimination, immigrants did not have access to.

So, immigration increased labour demand, and allowed natives to move into other occupations that had higher status. The last part of the article asks:

...why, if immigration was on average beneficial and had no tangible economic costs, it nonetheless triggered political backlash. I show that natives’ political reactions were increasing in the cultural distance between immigrants and natives, suggesting that backlash may have had, at least in part, non-economic foundations...

Only Catholic and Jewish, but not Protestant, immigrants induced cities to limit redistribution, favoured the election of more conservative legislators, and increased support for the 1924 National Origins Act.

In other words, the backlash against immigrants was driven by cultural, and not economic, reasons. You might think that things are different today (or, maybe you don't). However, a 2016 article published in the journal The World Economy (sorry, I don't see an ungated version online), by Vincent Fromentin, Olivier Damette (both University of Lorraine), and Benteng Zou (University of Luxembourg) looked at the impact of the Global Financial Crisis on native and foreign-born workers in four European countries (France, Germany, Spain, and the UK). They used quarterly data on the numbers of native-born and migrant workers by gender, skill level, and industry sector, covering the period from 2008 to 2012 - like the 1910-1930 period, the Global Financial Crisis was also a period of great migration upheaval. Fromentin et al. find that:

...statistically significant effects of the immigration shock on native-born worker employment rates between 2008 and 2012.

However, the effect differs by country, and is not in the direction that many people would expect. They conclude that (emphasis mine):

...the empirical results suggest that the immigration shock’s effects on native-born worker employment rates have been persistent and very weak over the business cycle. When making distinctions according to gender and levels of qualification, we find some major differences between the countries examined. It appears that variations among immigrant workers in France, Germany, the UK and Spain affect native workers of all skill levels. We note that this effect is globally positive.

People might think that immigration is too high, but based on these studies (and consistent with the famous study of the Mariel boatlift in the U.S., by David Card), it would be difficult to justify that view on economic (especially labour market) grounds. The labour market impacts of immigration on the native-born population appear to be positive, not negative. We should therefore recognise that most political backlash against immigrants is necessarily nativist.

Thursday, 9 January 2020

Happiness is dead?

I've written a number of posts about studies where the main dependent variable of interest is some form of happiness (or, more broadly, subjective wellbeing). There are various ways that subjective wellbeing is measured, but they are all some variant of asking people to rate how satisfied (or happy) they are on a scale, which might be 0-10, 0-5, or might have labelled categories (e.g. very happy; happy; neither happy nor unhappy; unhappy; or very unhappy). Studies then either compare groups of people in terms of the average subjective wellbeing, or investigate the factors that are associated with average subjective wellbeing.

However, it turns out that the whole field of happiness studies might be completely bogus (or, at least, questionable). In a new article published in the Journal of Political Economy (ungated earlier version here), Timothy Bond (Purdue University) and Kevin Lang (Boston University) demonstrate that the main conclusions drawn by basically the entire literature that uses subjective wellbeing are suspect.

Their argument is fairly mathematical:

There are a large (possibly infinite) number of states of happiness that are strictly ranked. In order to calculate a group’s “mean” happiness, these states must be cardinalized, but there are an infinite number of arbitrary cardinalizations, each producing a different set of means. The ranking of the means remains the same for all cardinalizations only if the distribution of happiness states for one group first-order stochastically dominates that for the other.

Essentially, the problem boils down to how you convert a scale like those used to measure subjective wellbeing into a number that can be used for quantitative analysis. It might seem straightforward when the scale is 0-10, but that assumes that the numbers themselves are meaningful. That is, it assumes that the difference between a 2 and a 3 is the same as the difference between a 7 and an 8. It also assumes that every person in the sample rates the scale the same, so that a 7 is the same for everyone. Neither of these assumptions is necessarily true, and things get even worse when you are using a labelled scale rather than a numerical one.

In that case, if you are ranking two groups (A and B) in terms of their average (mean) happiness, and you think that A is the happier group, you can only be sure that the ranking holds for every possible conversion of the scale into numbers if A's distribution of happiness lies above B's throughout. In other words, every percentile of the distribution of happiness for A must be at least as happy as the same percentile of the distribution for B. That's a very strong requirement.
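A toy example may help here. The sketch below is a simplified, discrete illustration of the cardinalization problem and not Bond and Lang's actual procedure (their argument works with the underlying continuous distribution of happiness). The three-category scale, the group shares, and the two sets of category values are all hypothetical:

```python
def mean_score(shares, values):
    """Mean happiness given the share of a group in each category
    and the number assigned to each category."""
    return sum(s * v for s, v in zip(shares, values))

def first_order_dominates(shares_a, shares_b):
    """True if A's cumulative share is never above B's at any category,
    i.e. A (weakly) first-order stochastically dominates B."""
    cum_a = cum_b = 0.0
    for a, b in zip(shares_a, shares_b):
        cum_a += a
        cum_b += b
        if cum_a > cum_b + 1e-12:   # small tolerance for float rounding
            return False
    return True

# Shares answering: not too happy, pretty happy, very happy (made up)
group_a = (0.20, 0.60, 0.20)
group_b = (0.35, 0.25, 0.40)
group_c = (0.10, 0.50, 0.40)

# Two increasing ways of turning the same ordered categories into numbers
equal_steps = (0, 1, 2)   # each step up is worth the same
big_first = (0, 5, 6)     # escaping the bottom category matters most

for name, values in (("equal_steps", equal_steps), ("big_first", big_first)):
    print(name,
          "A:", round(mean_score(group_a, values), 2),
          "B:", round(mean_score(group_b, values), 2),
          "C:", round(mean_score(group_c, values), 2))
```

Under the equal-steps scale B has the higher mean, but under the second scale A does, even though both scales respect the ordering of the categories. Neither A nor B first-order stochastically dominates the other, which is why the ranking can flip. Group C does dominate A, so C comes out ahead of A whichever increasing scale is used.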

In fact, since this strict requirement is almost never observed in practice, Bond and Lang go on to demonstrate that some of the key results from the happiness literature can be reversed if you make different assumptions about how the scales are converted into numbers (emphasis is mine):

...we never have rank-order identification and can always reverse the standard conclusion by instead assuming a left-skewed or right-skewed lognormal...

Thus if researchers wanted to draw any conclusions from these data, they would have to eschew rank-order identification. In other words, they would have to argue that it is appropriate to inform policy based on one arbitrary cardinalization of happiness but not on another or, equivalently, that some cardinalizations are “less arbitrary” than others... we further show that nearly every result can be reversed by a lognormal transformation that is no more skewed than the wealth distribution of the United States... Even within this class of distributional assumptions, we cannot draw conclusions stronger than “Nigeria is somewhere between the happiest and least happy country in the world” or “the effect of the unemployment rate on average happiness is somewhere between very positive and very negative.”

Yikes! They conclude that:

It is essentially impossible to rank two groups on the basis of their mean happiness using the types of survey questions prevalent in the literature.

And also:

Certainly calls to replace GDP with measures of national happiness are premature.

It will be interesting to see how pro-happiness researchers respond to this attack on the very foundations of their work.

[HT: Marginal Revolution, back in 2018 when this was still a working paper]

Wednesday, 8 January 2020

Book review: Can You Outsmart an Economist?

I enjoy books of puzzles or brainteasers. Often, the answer is staring you in the face, and only becomes obvious once it is revealed. I felt that way a lot when reading Steven Landsburg's Can You Outsmart an Economist?

This isn't your traditional book of puzzles though. It does have some of the usual fare, but mostly the puzzles are designed and grouped together in order to teach some basic economic understanding. I found that aspect of the book the most interesting, but simultaneously the most off-putting. The puzzles and solutions are good, and the transitions between them are logical and well explained. However, Landsburg's writing style is at times rather arrogant, which will turn off some readers. I feel like I can write that with some authority, since a journal reviewer once claimed that I wrote in "the style of an arrogant economist".

Landsburg is good at picking where the general reader will get the puzzles wrong. But not always - in the section on strategy, when it is obvious that one of the games is repeated, I suspect that some readers will genuinely outsmart the economist.

I found a few points of interest to me in the book, aside from the puzzles themselves. The section on the difference between shared knowledge and common knowledge was interesting, as was the description of a 'zero-knowledge proof'.

However, this book isn't going to be to everyone's liking. If you are an economist, or open to developing a deeper understanding of the application of economics to (often abstract) puzzles, then this book is for you. Otherwise, I would recommend giving it a miss.

Tuesday, 7 January 2020

Democracy and economic growth

There have been many studies of the impact of democracy on economic growth, with widely varying results. What conclusion should we draw from such a broad and contradictory body of evidence?
A new article by Marco Colagrossi, Domenico Rossignoli, and Mario Maggioni (all Università Cattolica del Sacro Cuore in Italy), published in the European Journal of Political Economy (open access, but just in case here is an earlier ungated version), uses meta-analysis and meta-regression to provide an answer.

Meta-analysis is a method of combining the results of many previous studies to generate a single (and usually more precise) estimate of the effect size. In this case, Colagrossi et al. combine the results of 2047 regression models from 188 different papers, and find that the effect of democracy on economic growth is:

...positive and strongly significant (p < 0.01) in all meta-analytic models.

This is despite only one-third of individual estimates being positive and statistically significant. The effect is sizeable, being about one-third of the size of the effect that human capital has on economic growth. The authors then go on to use meta-regression to investigate which features of the different estimates are most associated with finding a positive effect of democracy on economic growth. They find that:

Effect sizes are mostly driven by spatial and time differences in the sample, indicating that the democracy and growth nexus is largely dependent on the world’s regions and periods considered.

More specifically, if the sample includes data from sub-Saharan Africa or from high-income countries, then it is more likely that a positive relationship between democracy and economic growth will be found. The opposite (less likely to find a positive relationship) is true if the sample includes countries from South Asia. The authors reason that:

In South Asia, the lobbying power of some labour and industrial groups can lead to an inefficient investment allocation in democratic regimes promoting rent-seeking behaviours and, consequently, economic inefficiencies at the aggregate level. Against this background, authoritarian political elites can have the autonomy needed to promote economic growth without being restrained by rent-seekers’ pressures....

In terms of time periods, if the sample includes observations from the 1960s, the 1970s, or the 2000s, it is less likely that a positive relationship between democracy and economic growth will be found. The authors note that:

This result is consistent with the fact that during the 1960s and part of the 1970s a relevant subset of democratising countries was experiencing the decolonisation phase. Thus, despite a formal increase in their democracy levels, they were also experiencing economic turmoils, hence low (or even negative) growth rates. The 2000s crises, as well the economic booming of autocratic China, drive instead the negative and significant coefficient of this dummy. Conversely, including the 1980s largely increases the probability of obtaining a positive relationship. The gradual stabilisation of the decolonisation processes, and the begin of the downturn of the Soviet block, could be interpreted as a golden age of the democracy and growth relationship.

Overall, this paper is a model for how a meta-analysis and meta-regression should be undertaken. The methods and results are very well explained throughout. Even if you are not interested in this particular topic, if you are thinking about meta-analysis, then this paper would be an exemplar to follow. On top of that, it answers an important question: it does appear that democracy is associated with higher economic growth. [*]

*****

[*] When they test for the effect of controlling for endogeneity within their meta-regressions, they find that the effect size gets larger. That provides some evidence that this relationship (from democracy to economic growth) may be causal.
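To see how a pooled estimate can be strongly significant when only a minority of the individual estimates are, here is a minimal sketch of inverse-variance (fixed-effect) pooling. The six estimates and standard errors are hypothetical numbers made up for illustration, and this is a textbook simplification, not necessarily the estimator that Colagrossi et al. used.

```python
import math

def pool_fixed_effect(estimates, std_errors):
    """Inverse-variance weighted (fixed-effect) pooled estimate and its standard error."""
    # More precise estimates (smaller standard errors) get more weight.
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se

# Hypothetical effect estimates and standard errors (NOT taken from the paper).
estimates = [0.10, 0.05, -0.02, 0.15, 0.08, 0.03]
std_errors = [0.04, 0.06, 0.05, 0.07, 0.06, 0.05]

pooled, pooled_se = pool_fixed_effect(estimates, std_errors)

# Count the individual estimates that are positive and significant at the 5% level.
n_significant = sum(1 for b, se in zip(estimates, std_errors) if b / se > 1.96)

print(f"Individually significant: {n_significant} of {len(estimates)}")
print(f"Pooled estimate: {pooled:.3f} (z = {pooled / pooled_se:.2f})")
```

With these made-up numbers, only two of the six estimates are individually significant, yet the pooled z-statistic clears the 1% threshold, because combining the studies shrinks the standard error.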

Monday, 6 January 2020

Confirmation bias and climate change maths

Most academics would like to think (or hope?) that people form opinions on the basis of some form of evidence. We might disagree on what constitutes appropriate evidence, but we'd like to think that if people are presented with sufficient evidence that runs counter to their established opinions, they might change their opinion. Of course, this flies in the face of confirmation bias - the idea that people selectively interpret information, readily accepting and remembering information that confirms pre-existing beliefs and opinions, while dismissing and quickly forgetting information that challenges those beliefs and opinions.

Although it doesn't refer to confirmation bias (at all), this article in The Conversation by Will Grant (Australian National University) shows just how pervasive confirmation bias can be. The article refers to this article published in the journal Environmental Communication last year (sorry, I don't see an ungated version online), by Matthew Nurse (also Australian National University) and Will Grant.

Nurse and Grant asked people to solve a maths problem based on contingency tables, in order to answer a question about whether the data shows that something got better, or worse. There were two contexts: (1) a new skin cream, and its effect on a rash; and (2) the closure of coal fired power stations, and their effect on carbon dioxide emissions.

Solving a contingency table correctly is not easy. It requires a certain level of mathematical literacy. Here's the table that people were presented with (there were actually four versions, two each for skin creams and power stations, and two each where the correct answer was that things got worse, and where things got better):

In this table, for 223 patients who used the skin cream, the rash got worse, while for 75 patients who used the skin cream, the rash got better. So, of those who used the skin cream, the rash got better for 25.2% (75/[75+223]). For 107 patients who didn't use the skin cream, the rash got worse, while for 21 patients who didn't use the skin cream, the rash got better. So, of those who didn't use the skin cream, the rash got better for 16.4% (21/[21+107]). The table should provide evidence that the skin cream works.
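The arithmetic above can be checked with a few lines of Python. This is just a sketch using the counts quoted in the paragraph; the variable and function names are my own.

```python
# Counts as described in the text for the skin cream version of the table.
cream_worse, cream_better = 223, 75
no_cream_worse, no_cream_better = 107, 21

def share_better(better, worse):
    """Proportion of a group whose rash got better."""
    return better / (better + worse)

cream_rate = share_better(cream_better, cream_worse)
no_cream_rate = share_better(no_cream_better, no_cream_worse)

print(f"Used the cream:        {cream_rate:.1%} got better")
print(f"Did not use the cream: {no_cream_rate:.1%} got better")
```

The point of the comparison is that the two groups are different sizes, so it is the proportions within each row that matter, not the raw counts.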

Now, if the numbers were reversed, that should provide evidence that the skin cream does not work. Similarly, depending on which way the numbers are presented, there should be evidence either in favour of closing power stations reducing carbon dioxide emissions, or not.

Given that the numbers are identical in all four cases, and bearing in mind that solving this is reasonably challenging, you would expect that similar percentages get the correct answer no matter which version they are presented with. Unfortunately, that wasn't the case.

Nurse and Grant tested this on 504 Australians, half of whom were supporters of the Australian Greens (ideologically far left), and half who were supporters of the One Nation Party (ideologically far right). It turns out that political views affected the proportion who got the answer correct, but only for the climate change context. Here's Figure 2 from the paper:

In each panel, the two bars on the left show the proportion who got the answer correct, and the two bars on the right show the proportion who got the answer wrong. The green bars are supporters of the Australian Greens, and the yellow bars are supporters of the One Nation Party. The top two panels are the skin cream context, and you can see that (especially for the right panel) the proportion getting the answer correct doesn't appear to depend on political affiliation. The bottom two panels are the climate change context, and it shows that supporters of the Greens are much more likely to get the answer correct if the correct answer is that carbon dioxide emissions decrease when power plants are closed, while supporters of One Nation are much more likely to get the answer correct if the correct answer is that carbon dioxide emissions increase when power plants are closed.

That isn't the end of the story though. Nurse and Grant calculated the difference in the odds of getting the correct answer between the two ideologies, for different levels of numeracy. You might expect that more numerate people would be less likely to be swayed by their political ideology. However, controlling for numeracy, the opposite appeared to be the case. For instance, they report that:

...a One Nation supporter with a numeracy score of three in the identity threatening “CO2 does decrease” condition was 26 per cent as likely to respond with the correct answer (odds ratio 0.26, P < .01) compared to a Greens supporter in the same numeracy category. However, in this condition, a One Nation supporter with a numeracy score of seven was only 5 per cent as likely to provide the correct answer as a Greens supporter in the same numeracy category (odds ratio 0.05, P < .01).

Nurse and Grant argue that this represents 'motivated reasoning'. From The Conversation article:

These findings build on the theory that your desire to give an answer in line with your pre-existing beliefs on climate change can be stronger than your ability or desire to give the right answer.

In fact, more numerate people may be better at doing this because they are have more skills to rationalise their own beliefs in the face of contradictory evidence.

This paper provides some discouraging news for those of us who hope that we can convince people with evidence. However, it isn't usually the average voter that we are trying to convince; rather it is policy makers, business people, etc. It would be interesting to see whether this study would replicate among those groups.

Sunday, 5 January 2020

Book review: Randomistas

It seemed appropriate to follow up reading Banerjee and Duflo's Poor Economics (which I reviewed here last month) with Andrew Leigh's Randomistas. Where Banerjee and Duflo focused their book on experiments in development, Leigh ranges much wider. The subtitle is "How radical researchers are changing our world". I'm not sure I would go as far as 'radical', but the book does provide a useful overview of the use of randomised controlled trials in a range of applications, from medicine to education, crime and justice, poverty and development, agriculture, and even politics and philanthropy. This is a good place to start if you want to see how widely randomised trials are being used.

I like the way that Leigh writes, and especially this bit of motivation for experiments:

In the film Sliding Doors, we follow the life of Gwyneth Paltrow's character, Helen, according to whether or not she manages to catch a train... What makes Sliding Doors a fun movie is that we get to see both pathways - like rereading a Choose Your Own Adventure book. We get to see what economists call the 'counterfactual' - the road not taken...

Researchers have spent years thinking about how best to come up with credible comparison groups, but the benchmark to which they keep returning is the randomised trial. There's simply no better way to determine the counterfactual than to randomly allocate participants into two groups: one that gets the treatment, and another that does not.

It is surprising to see the variety of randomised trials that have been employed. Like me, I bet you didn't realise the link between Sesame Street and randomised trials:

In its first year, Sesame Street was evaluated in a randomised trial, which compared a treatment group (children who were encouraged to watch the program) with a regular control group. Unfortunately, the researchers hadn't reckoned on the show's popularity. With more than one-third of American children tuning in to each episode, there wasn't much difference in viewing rates between the two groups...

So the next year, researchers took a different approach - focusing on cities where Sesame Street was only available on cable and then randomly providing cable television to a subset of low-income households... Children who watched Sesame Street had the same cognitive skills as non-viewers who were a year older.

Leigh writes in an engaging style, and while not all of the content is new (especially if you have read Poor Economics), there is plenty to retain the reader's interest. The subtitle of the book even resulted from an experiment!

Leigh also doesn't shy away from criticisms of randomised trials, and takes note of the replication crisis in social science, to which randomised trials have not been immune. I felt like this particular part of the book could have been expanded more. However, I was interested to note that Leigh makes explicit mention of one experiment that has been (and continues to be) widely reported - Sheena Iyengar and Mark Lepper's 2000 study of the paradox of choice (ungated), in relation to jam purchases (see also this article by Barry Schwartz, who wrote the book The Paradox of Choice):

A decade after the initial study appeared, a team of psychologists collated as many of these replication studies as they could find... Among the fifty replication studies, a majority went in the opposite direction from the original jam choice experiment. Averaging all the results, the psychologists concluded that the number of available options had 'virtually zero' impact on customer satisfaction or purchases.

I really enjoyed reading this book, especially as a companion to Poor Economics. Recommended for your late holiday reading list!

Thursday, 2 January 2020

Congratulations Dame Marilyn Waring

I was delighted to read earlier this week that Marilyn Waring was made a Dame Companion of the New Zealand Order of Merit, for services to women and economics. It may be a little surprising to some for her award to make reference to economics, as her original tertiary studies were in political science (although similarly, Elinor Ostrom won the Nobel Prize in economics, despite being considered by most to be a political scientist). Moreover, she may be best known to New Zealanders as a National Party MP from the 1980s.

However, internationally Dame Marilyn is known as one of the founders of feminist economics, which focuses on a number of topics, particularly around gender, which had been mostly neglected in the economics mainstream up to the 1980s (and many would argue, since then as well). Her first major contribution to economics was the book If Women Counted, which is a well-developed critique of GDP. In particular, the book focuses on how women's unpaid work is under-valued, because GDP only takes account of market production. The argument goes that if only market production is counted, then women don't count. Although gender equality has arguably improved since the 1980s, the broader point about the limitations of GDP remains (and was not new when Waring made it, although the gender dimension was new).

Randomly, I met Dame Marilyn in Vienna in 2010, at the International AIDS Conference, where she was giving a talk. I don't remember the details of her talk (and I'm away from the office, so unfortunately I can't even look it up). I do remember that many of the New Zealanders at the conference, including Dame Marilyn, had a lovely dinner together one evening during the conference (and a good thing too, because the other thing I remember about the conference was how expensive it was to eat in Vienna!).

This is a well-deserved and probably overdue honour. Congratulations Dame Marilyn Waring!

Wednesday, 1 January 2020

Is there a happiness Kuznets curve?

The Kuznets curve is the inverted-U-shaped relationship between inequality and income per capita:

It was proposed by 1971 Nobel Prize winner Simon Kuznets. Kuznets' theoretical explanation for this relationship was that, at low levels of development, inequality was relatively low. Then, as a country developed, the owners of capital would be the first to benefit because of the greater investment opportunities that the economic growth provided. This would lead to an increase in inequality associated with economic growth. Eventually though, at even greater levels of development the taxes paid by the capitalists would increase, leading to developments such as a welfare state, improved education and healthcare, all of which would improve the incomes of the poor. So, at higher levels of development, inequality would decrease with economic growth. There is a fair amount of support for this relationship (for example, see my 2017 post on this topic).

Income per capita is one measure of wellbeing, albeit one that doesn't capture all aspects of wellbeing. An alternative measure that is increasingly being suggested is subjective wellbeing, or happiness. So, given that there is an inverted-U relationship between inequality and income per capita, it is reasonable and important to ask whether there is a similar relationship between happiness and 'inequality in happiness'.

That is the research question that this 2017 article, by Rati Ram (Illinois State University), published in the journal Economic Modelling (sorry I don't see an ungated version online), seeks to answer. Ram uses several sources of data on cross-national averages and standard deviations of happiness, and runs regressions to test for a quadratic relationship (a relationship that would show an inverted-U shape as in the diagram above). The measure of happiness is based on the Cantril Ladder, which measures subjective wellbeing on a 0-10 scale.

Using this country-level data, Ram finds that:

...it is evident that there is clear evidence of a Kuznets-type inverted-U relation between mean happiness and happiness-inequality represented by standard deviation...

The "turning point" implied by the specification... occurs when mean happiness is 4.93.
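For readers unfamiliar with where a "turning point" comes from: in a quadratic regression of happiness inequality on mean happiness and its square, the curve peaks where the slope is zero, which is at minus the linear coefficient divided by twice the squared coefficient. The sketch below assumes that simple specification. The coefficients are hypothetical, chosen only so that the result matches the 4.93 quoted above; they are not Ram's estimates.

```python
def turning_point(b1, b2):
    """Mean happiness at which b0 + b1*h + b2*h**2 peaks (an inverted U needs b2 < 0)."""
    if b2 >= 0:
        raise ValueError("an inverted U needs a negative coefficient on the squared term")
    # Setting the derivative b1 + 2*b2*h equal to zero and solving for h.
    return -b1 / (2.0 * b2)

# Hypothetical coefficients (NOT from the paper), picked to give a peak at 4.93.
b1, b2 = 0.986, -0.1
tp = turning_point(b1, b2)
print(f"Turning point at mean happiness of {tp:.2f}")
```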

However, there is a serious problem with this analysis, and it results from the use of standard deviation as a measure of inequality of happiness.

Imagine all of the possible country-level distributions of happiness, measured on a 0-10 scale. Say that, across countries, mean happiness is about 5.5 (in Ram's study, his three measures range from 5.38 to 5.91). Some of the country-level distributions have a high mean (higher than 5.5), and some have a low mean (lower than 5.5). In each case, individual happiness is spread around the country's mean, and that spread can be measured by the standard deviation.

Now, think about the distributions that have a mean that is higher than 5.5. By definition, the highest possible mean is 10. As the mean gets larger, the top half of the distribution becomes closer to the mean (because nobody's happiness can be higher than 10). The result is that the standard deviation becomes smaller.

Similarly, for distributions that have a mean that is lower than 5.5. As the mean gets smaller, the bottom half of the distribution becomes closer to the mean (because nobody's happiness can be lower than 0). The result is that the standard deviation becomes smaller.

Now think about this overall. As the country-level mean happiness moves away from the overall mean value of 5.5, in either direction, the standard deviation will get smaller. This isn't because there is a Kuznets curve relationship; it is simply a mechanical result of the way that standard deviation is related to the mean, when the scale is bounded at each end.
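The mechanical relationship can be made concrete with a short sketch. For scores restricted to a 0-10 scale, the largest standard deviation that is possible at a given mean is the square root of (mean - 0) x (10 - mean), a standard result known as the Bhatia-Davis inequality. As a second illustration, the sketch computes the standard deviation of a Beta distribution stretched to the 0-10 scale with its 'concentration' held fixed. The Beta distribution and the concentration value of 4 are my own assumptions for illustration, not anything taken from Ram's data.

```python
import math

SCALE_MIN, SCALE_MAX = 0.0, 10.0

def max_sd(mean):
    """Largest possible SD for scores bounded in [0, 10] with the given mean."""
    return math.sqrt((mean - SCALE_MIN) * (SCALE_MAX - mean))

def scaled_beta_sd(mean, concentration=4.0):
    """SD of a Beta(a, b) variable stretched to 0-10, with a + b = concentration."""
    mu = (mean - SCALE_MIN) / (SCALE_MAX - SCALE_MIN)
    # Variance of a Beta distribution written in terms of its mean and a + b.
    variance = mu * (1.0 - mu) / (concentration + 1.0)
    return (SCALE_MAX - SCALE_MIN) * math.sqrt(variance)

for mean in [2.0, 3.5, 5.0, 6.5, 8.0]:
    print(f"mean = {mean:4.1f}   max SD = {max_sd(mean):.2f}   "
          f"Beta SD = {scaled_beta_sd(mean):.2f}")
```

Both columns trace out an inverted U in the mean without any behavioural story behind them. Strictly, the ceiling peaks at the midpoint of the scale (5) rather than at the cross-country average, which is not far from the 4.93 turning point that Ram reports.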

There may be a Kuznets curve relationship between happiness and inequality in happiness, but this study doesn't tell us anything about whether it exists. Standard deviation is not the right measure to use. A better measure of happiness inequality is required, which doesn't have a mechanical relationship with average happiness. I wonder if a Gini coefficient might work, or a Theil index? Both are more usual measures of inequality than the standard deviation. Identifying a better measure, and testing the relationship using the better measure, might make for an interesting Honours or Masters research project for a good student.
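As a starting point for such a project, here is a minimal sketch of a Gini coefficient calculation for individual happiness scores, using the standard rank-based formula. The response lists are hypothetical. Whether the Gini coefficient actually escapes the mechanical link with the mean on a bounded 0-10 scale is an open question that a student would need to check, not something this sketch establishes.

```python
def gini(values):
    """Gini coefficient of non-negative values (0 means perfect equality)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Rank-weighted sum, with ranks starting at 1 for the smallest value.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n

# Hypothetical Cantril Ladder responses (0-10), for illustration only.
everyone_the_same = [6, 6, 6, 6]
one_very_happy = [0, 0, 0, 10]

print(gini(everyone_the_same))
print(gini(one_very_happy))
```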