Tuesday, 28 June 2022

Social disorganisation and crime over two centuries

Social disorganisation theory is the idea that differences (or changes) in family structures and community stability are a key contributor to differences (or changes) in crime rates between different places (or times). It suggests that neighbourhoods that are more unstable (higher social disorganisation) will have higher crime rates than neighbourhoods that are less unstable (lower social disorganisation), ceteris paribus (holding everything else constant). It also suggests that if neighbourhoods become more unstable over time, crime rates will increase, ceteris paribus. I've mostly encountered social disorganisation theory in relation to the effect of alcohol outlets on crime. In that context, things become tricky, because alcohol outlets tend to locate in more unstable (higher social deprivation) neighbourhoods, which may also have higher crime because of social disorganisation. So, disentangling the effects of alcohol outlets from the effects of social disorganisation more generally is difficult.

The broader literature on social disorganisation theory faces issues as well, because of potential reverse causation and endogeneity. If you want to test the effect of neighbourhood instability on crime, you have to recognise that not only can instability cause crime, but crime can cause instability. Finding ways of dealing with these issues is important.

So, I was interested to read this recent article by Zeresh Errol (Monash University), Jakob Madsen (University of Western Australia), and Solmaz Moslehi (Monash University), published in the Journal of Economic Behavior and Organization (sorry, I don't see an ungated version online). They use annual data covering the period 1840 to 2018 for 16 countries: Australia, Belgium, Canada, Denmark, Finland, France, Germany, Ireland, Italy, Japan, the Netherlands, Norway, Sweden, Switzerland, the U.K. and the U.S. Their data includes crime rates (per 100,000 population), family structure (proxied by the divorce rate and the share of out-of-wedlock births), community structure (proxied by the urbanisation rate), GDP per capita, and the proportion of the population aged 15-29 years. The crime rates are disaggregated into property crime, violent crime, homicide, robbery and assault. Errol et al. then look at the relationship between the family structure and community structure variables and crime rates (controlling for the other variables, along with country and time fixed effects).

However, remember that endogeneity is a problem here for the family structure variables, so any simple regression analysis is not going to tell us about the causal relationship between family structure and crime. Errol et al. solve this problem using instrumental variables analysis. Essentially, this involves finding an instrument that affects the endogenous variable (divorce or out-of-wedlock birth rate), but has no direct effect on the outcome variable (crime rates). In a unique twist to this paper, Errol et al. use the weighted average of the family structure variables in all other countries as instruments for the family structure variables in country i. For example, for Australia's divorce rate, they use as an instrument the weighted average of the divorce rates in all other countries. The weighted average of other countries' rates is a valid instrument because divorce rates should be related across countries, but the divorce rate in Australia should not exert any effect on the crime rate in Belgium (or any other country). You can tell a similar story for out-of-wedlock birth rates.
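To see why an instrument helps, here is a minimal sketch using simulated (not real) data. It is not the specification that Errol et al. estimate. All of the names and numbers, including the 'true' effect of 0.5 and the unobserved factor `u`, are assumptions made purely for illustration. An unobserved factor pushes both the family structure variable and the crime rate, so ordinary least squares overstates the effect, while two-stage least squares using the instrument gets close to the true value:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
true_effect = 0.5  # hypothetical causal effect of the family structure variable on crime

# Simulated data only. 'u' is an unobserved factor that moves both x and y,
# which is what makes x endogenous.
z = rng.normal(size=n)                         # instrument: shifts x, no direct effect on y
u = rng.normal(size=n)                         # unobserved confounder
x = z + u + rng.normal(size=n)                 # endogenous regressor (e.g. divorce rate)
y = true_effect * x + u + rng.normal(size=n)   # outcome (e.g. crime rate)


def slope(a, b):
    """Slope coefficient from a simple regression of b on a (with an intercept)."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)


# Naive regression of y on x: biased upwards because of u.
ols_estimate = slope(x, y)

# Two-stage least squares by hand.
# Stage 1: predict x using only the instrument.
first_stage = slope(z, x)
x_hat = x.mean() + first_stage * (z - z.mean())
# Stage 2: regress y on the predicted values of x.
iv_estimate = slope(x_hat, y)

print(f"OLS estimate: {ols_estimate:.2f}")
print(f"IV estimate:  {iv_estimate:.2f} (true effect is {true_effect})")
```

The instrument only works here because, by construction, `z` affects `y` only through `x`. In real data that assumption has to be argued for, as Errol et al. do.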

Now, rather than using the straight average of all other countries, Errol et al. use a weighted average, where the weights are based on the linguistic distance between the countries. So, data from countries where the main language is more similar will have a greater weight in the calculation of the average. This is quite an exciting aspect of the article to me, because it relates closely to some ongoing work I have been doing on using cultural distance measures in new and exciting ways (more on that in future posts).
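As a rough sketch of how such an instrument could be built, the code below computes, for each country, a weighted average of the other countries' divorce rates. The three countries, their divorce rates, the distance matrix, and the choice of inverse-distance weights are all made up for illustration. I don't know the exact weighting formula that Errol et al. use, only that linguistically closer countries get more weight:

```python
import numpy as np

countries = ["A", "B", "C"]                 # hypothetical countries
divorce_rate = np.array([2.0, 3.0, 5.0])    # made-up divorce rates in a single year

# Made-up linguistic distances (symmetric, zero on the diagonal).
distance = np.array([
    [0.0, 1.0, 4.0],
    [1.0, 0.0, 2.0],
    [4.0, 2.0, 0.0],
])


def leave_one_out_instrument(values, distance):
    """For each country i, average the OTHER countries' values,
    weighting each by the inverse of its linguistic distance from i
    (an assumed weighting scheme)."""
    n = len(values)
    instrument = np.empty(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        weights = 1.0 / distance[i, others]   # closer language = larger weight
        weights = weights / weights.sum()     # weights sum to one
        instrument[i] = np.dot(weights, values[others])
    return instrument


instrument = leave_one_out_instrument(divorce_rate, distance)
for name, value in zip(countries, instrument):
    print(f"Instrument for country {name}: {value:.2f}")
```

Note that country i's own divorce rate never enters its own instrument, which is the point of the exercise.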

Looking at the IV regression results, Errol et al. run separate regressions for the five crime rates, and separately using the divorce rate and the out-of-wedlock birth rate as their proxy for family structure. Across these various models, they find that:

...the coefficients of Div and Owed are significantly positive in seven of the ten cases, where two of the insignificant coefficients pertain to homicide.

In other words, higher divorce rates and out-of-wedlock birth rates cause an increase in crime rates. The size of the coefficients is a little difficult to interpret, given that what Errol et al. report is a coefficient that sums several lagged values. However, they do seem to be meaningful in size, given that Errol et al. note for the ordinary least squares (not IV) regression, that:

Based on the coefficients of the 10-year first difference estimates of Div (Owed), a one standard deviation increase in Div (Owed), is associated with a 15.4(1.8) and 174.4(153.3) percentage point increase in the rates of violent crime and property crime, respectively...

So, this paper demonstrates the importance of social disorganisation in understanding differences in crime rates over an extremely long time period. However, it also illustrates a potentially fruitful way of constructing instruments for instrumental variables analyses, using cultural (or, in their case, linguistic) distance weighted averages. Expect to see more work in the future employing this approach.

Monday, 27 June 2022

When everything's a crisis, is anything really a crisis?

This Duncan Garner article in the National Business Review (gated) made me laugh, because it raised a point I have made many times (albeit, not on my blog):

The word crisis used to mean something. 

Now it seems everything is in a crisis. The latest crisis came just before our Matariki long weekend – a power shortage crisis. Maybe the Matariki stars will be our only form of light as now this country can't guarantee much at all, the latest being power.  

Why and how did this happen? And where's the accountability? The Minister overseeing this problem a year ago should walk the plank for failing to deliver a year later. That would be genuine accountability wouldn't it? 

So what's happening right now? Is it just us or is everything melting down? Look around – isn't everything a crisis or in crisis? 

There's the housing crisis, affordability crisis, cost of living crisis, mental health crisis, climate change crisis, health crisis, nursing shortage crisis, suicide crisis, water crisis, youth crisis, elderly abuse crisis, building supply crisis, Gib crisis. Did I mention the consumer confidence crisis

The obesity crisis? 

The manufacturing crisis?

Garner also missed the inequality crisis, the child poverty crisis, the supply chain crisis, the global data privacy crisis, and the Christmas toy crisis. And a few minutes on any search engine would no doubt turn up a dozen or more other crises.

Now, by definition, a crisis is: "a time of intense difficulty or danger". That may be true of most of the crises listed above (ok, maybe not the Christmas toy crisis, unless disappointed children and parents get really upset). However, the problem is that once you start to label everything as a crisis, the word starts to lose all meaning. In the good old days, a crisis really meant something. We had to act immediately, or face impending doom. Now, it just means things like a shortage of flowers on Valentine's Day. Yawn.

The problem here is that people's attention is scarce. People are not going to worry about the "pet adoption issue", or the "pet adoption problem", or even the "pet adoption predicament". If you want rational people's attention, then it has to be a pet adoption crisis, or the perceived benefits of them paying attention won't outweigh the costs, and your issue will be lost in the general hubbub of people's social media feeds. However, if the incentive is to call everything a crisis, then there is no way for people to filter the really serious crises out from the noise, and they will stop paying attention.

Maybe, like Stephen Hickson's suggestion for a woolly words trading scheme, we need a more-specific crisis trading scheme, which caps the use of the word 'crisis'? We need this policy now, because we are in the middle of a crisis crisis.

Saturday, 25 June 2022

More on teaching evaluations and grade inflation

My Study Leave period is now over, and I've been preparing for my B Trimester teaching. At this time, many teachers would go back over past student evaluations, to determine what things they need to change or update in their teaching (instead, I tend to keep a detailed list of changes from the end of the previous teaching period, e.g. my current list of changes for ECON102 has about 90 minor tweaks to various topics, often incorporating new examples or new ways of organising or explaining the ideas). So, student evaluations could in theory be a useful tool for improving teaching. However, as I've outlined in many past posts (for example, see this post and the links at the end of it), student evaluations of teaching (SETs) have a number of serious problems, particularly gender bias.

However, student evaluations also create incentive problems. Students tend to evaluate courses higher if they get a better grade. Teachers know that student evaluations affect their chance of promotion or advancement (even if research performance is considered more important by universities). So, teachers have an incentive to give students higher grades, and in turn receive better teaching evaluations as a result. I've posted on this incentive effect before.

Some further evidence of the relationship between grades and student evaluations is provided in this 2008 article by Laura Langbein (American University), published in the journal Economics of Education Review (ungated earlier version here). Langbein uses data from over 7600 courses taught at American University over the period from 2000 to 2003. She notes that even over that period, grade inflation is apparent in the data, as:

...the mean grade in 100-level courses increased from 3.1 to 3.2; the percent who earn less than a B in these courses dropped from 28% to 25%. For 200-level courses, the mean grade increased from 3.1 in Fall 2000 to 3.2 in Spring 2003, and the percent earning less than a B dropped from 29% to 24%. For 300- level courses, the mean grade remained unchanged at 3.3, but the percent below B dropped from 22% to 18%, and the median grade increased from B+ to A-. Among higher level classes, no such clear pattern of aggregate grade inflation is apparent.

Langbein then shows that students' actual grades and expected grades are both correlated with the students' evaluation of teaching, finding that:

...that the impact of a unit increase in the expected grade (say, from B to A, which contains most of the observations) would raise the instructor’s rating by an average of nearly 0.6 on a 6-point scale...

...a one-point increase in the average actual grade (say, from B to A, which is also where the observations lie) raises the SET by about 0.1 point on a 6-point scale...

Those results in themselves aren't necessarily a cause for concern. If the teaching is good, students should learn more (better actual and expected grades), and should evaluate the teaching higher. On the other hand, it could arise because when students are receiving and expecting a higher grade, they may 'reward' the teacher with a higher evaluation of their teaching, regardless of the actual quality of the teaching.

To try and disentangle those two competing explanations, Langbein applies a Hausman endogeneity test. Essentially, she tests whether there is reverse causality between the actual grade in a course and the SET - that is, whether the higher SETs cause higher grades (as well as higher grades causing higher SETs). She finds that:

While the results... clearly uphold the conclusion that faculty are rewarded with higher SETs if they reward students with higher grades, the sign of the residual variable depends on the specification of the endogeneity test. Under one specification, the sign of the residual is negative; under the other, it is positive. Consequently, the results give no clear indication about whether some component of the SET is a measure of ‘‘good’’ teaching and more learning, or easy class content and less learning.

So, while higher grades do lead to higher SETs, Langbein isn't able to definitively tell us why, or whether SETs are a good measure of teaching quality. However, from other research we already have good reason to doubt SETs, due to bias. Add this research to the (growing) list of papers that suggest that student evaluations can't tell us anything meaningful about teaching quality, and instead simply create incentives for grade inflation.

Read more:

Friday, 24 June 2022

Narrowing down on the source of the beauty premium

I've written a number of posts about the beauty premium (see the links at the bottom of this post) - that more attractive people are paid more, on average. There is robust evidence for this beauty premium across many labour markets, and for both genders and different ethnic groups. However, evidence for the specific mechanism underlying the beauty premium remains elusive. There are a few different theoretical reasons that beauty premiums may arise. First, employers may engage in taste-based discrimination - they like attractive employees, so they pay them more. If that were the case, we would see beauty premiums in all jobs. On the other hand, perhaps attractive people are more productive (in the sense that they generate more value for employers). This might be the case for customer-facing roles, for example. If that were the case, then we would see beauty premiums in jobs that require more human interaction, but no premiums in jobs with less human interaction.

That is essentially the test undertaken in this 2019 article by Ralph Stinebrickner (Berea College), Todd Stinebrickner (University of Western Ontario), and Paul Sullivan (American University), published in the journal Review of Economics and Statistics (ungated earlier version here). They use data from the Berea Panel Study, which included just over 500 students who first enrolled in Berea College in 2000 or 2001, and surveyed them each year after graduation (for up to ten years for some students). Because they can link the survey data back to students' characteristics during study, including their student ID picture, they are able to measure both labour market outcomes and attractiveness. Importantly, the surveys asked about the specific tasks that the graduates are undertaking in their jobs, allowing Stinebrickner et al. to classify jobs based on the actual tasks that are undertaken, rather than based on job titles. This reduces the measurement error from misclassifying whether the graduates are in people-centred jobs or not. Stinebrickner et al. classify all jobs into:

...four groups on the basis of jobs’ primary tasks: two groups where the primary task involves interpersonal interactions (high-skilled People and low-skilled People) and two groups where the primary task does not involve interpersonal interactions (high-skilled Information and low-skilled Information).

Each graduate's attractiveness was based on the average of 50 evaluators' ratings of their student ID picture, rated on a scale of 1 (significantly below average) to 5 (very attractive). They then run a fairly standard regression of wages on attractiveness, controlling for college GPA (which essentially accounts for differences in ability and motivation). Stinebrickner et al. focus most of the analysis on the sample of female graduates (because the sample size is much larger than for male graduates). For women, they find that there is:

...a large, statistically significant coefficient on the attractiveness measure. Specifically, increasing attractiveness by 1 sample standard deviation (0.78 on the 5-point scale) is associated with a 7.8% increase in wages...

The corresponding estimate for men is a 6.8% increase for each standard deviation higher attractiveness. Those results are consistent with the other literature on the beauty premium. However, of more interest are the results based on the different job classifications, where Stinebrickner et al. find that:

Providing very strong evidence that the attractiveness premium should not be attributed to an employer taste-based explanation, the results show that attractiveness has a strong effect on wages in jobs that specialize in People tasks but not in jobs that specialize in Information tasks. Specifically, a 1 standard deviation increase in attractiveness leads to a 9.7% wage increase in high-skilled People jobs... and a 9.3% wage increase in low-skilled people jobs (column 2)... In sharp contrast to the large wage premiums in People jobs... [the results] show no evidence of an attractiveness premium in Information jobs. Specifically, the coefficients on attractiveness for high- and low-skilled Information jobs are 0.011 and −0.039, and neither parameter is statistically significant...

Stinebrickner et al. also find that attractive people tend to sort themselves into people-related jobs:

The estimates of the marginal effects... show that increasing attractiveness by 1 standard deviation increases the probability of having a primary task of People by 0.053...

...these estimates indicate that attractive individuals are more likely to choose to work in both high-skilled People jobs and in low-skilled People jobs than in low-skilled Information jobs.

As noted above, if the beauty premium arises because of employers' taste based discrimination, then there would be a beauty premium for all jobs. But Stinebrickner et al. only find a beauty premium for jobs that involve mostly interpersonal interactions, and not for jobs that are mostly information-related. That is consistent with a productivity explanation. However, Stinebrickner et al. sound a note of caution:

While we tend to refer to the alternative to this explanation as a productivity-based explanation, it is worth stressing that it is difficult, even from a conceptual standpoint, to distinguish between a productivity based explanation and a customer taste-based discrimination explanation. To help fix ideas, consider a standard textbook-type example in which a customer is willing to pay more to interact with an attractive server in a restaurant. This preference might be viewed as productivity based if the attractiveness leads to more efficient employee-customer interactions that help a customer arrive at the best possible food order. Or this preference might be viewed as customer taste-based discrimination if attractiveness does not influence the customer’s order, but the customer simply enjoys looking at a more attractive employee.

The existence of only fairly nuanced differences between the two scenarios in the example highlights why it will always be difficult to conclusively distinguish between the customer discrimination and productivity-based explanations...

So, Stinebrickner et al. have provided us with a bit more detail on the source of the beauty premium. It arises either from higher productivity of more attractive workers in people-facing roles, or from customer taste-based discrimination. Unfortunately, because we can't disentangle those two explanations, we are left without an answer to the important policy question of whether the beauty premium leads to inefficiency (as it would if based on customer taste-based discrimination) or not, and so there is no strong evidence to favour any intervention in the market to limit or reduce the premium.

Read more:

Thursday, 23 June 2022

Why price controls likely make things worse, not better

Phil Lewis wrote a good article on The Conversation today, about price controls:

Australian shoppers are facing a crisis in the fresh-food aisles.

Iceberg lettuces that cost $2.80 a year ago have doubled, or tripled, in price. Brussel sprouts that cost $4 to $6 a kilogram are now $7 to $14. Beans that cost $5 to $6 a kilogram are now more than double – and five times as much in remote areas...

The price hikes have led to calls for supermarkets to impose price caps to ensure shoppers can still afford to feed their families healthy food.

But price ceilings on goods or services rarely, if ever, work. Prices play an important role in allocating resources efficiently. They send a signal to both customers and suppliers. To arbitrarily reduce prices would only increase shortages – both now and in the longer term...

Higher prices provide a signal both to consumers and producers. They tell consumers to buy less and switch to alternatives. They provide an incentive for producers to grow more – though this process is fairly slow given the time needed to grow and harvest fruit and vegetables.

But eventually, if the market is left to its own devices, prices will eventually return to “normal”, consistent with historical prices.

Capping the price, on the other hand, will benefit those lucky enough to grab supplies when they available. But it will likely reduce supply even further, by affecting the decision of producers unwilling to supply at below-market prices.

It could also lead to a “black market”, with some customers sourcing supplies by other means at higher uncapped prices...

So generally price caps are to be avoided.

Now, if anything, Lewis understates the case against price controls (specifically, price ceilings - a legal maximum price which the market price is not allowed to exceed). Price ceilings are effective in lowering the price, but with a lower price, consumers want to buy more (this is the 'Law of Demand'). However, there isn't more to go around (if anything, the lower price reduces the incentive for sellers to supply the good). So, you have more consumers wanting to buy a restricted quantity of the good - it creates a shortage.
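A small numerical sketch makes the mechanism concrete. The linear demand and supply curves below are entirely made up (they are not estimates for lettuce or any other real market), and the function names are just for illustration. At the market-clearing price there is no shortage, but a ceiling below that price raises the quantity demanded and lowers the quantity supplied:

```python
def quantity_demanded(price):
    """Hypothetical linear demand curve: consumers buy less as the price rises."""
    return 100 - 10 * price


def quantity_supplied(price):
    """Hypothetical linear supply curve: sellers offer more as the price rises."""
    return 20 + 10 * price


def shortage(price):
    """Excess of quantity demanded over quantity supplied (zero if there is none)."""
    return max(0, quantity_demanded(price) - quantity_supplied(price))


equilibrium_price = 4   # where 100 - 10P equals 20 + 10P
price_ceiling = 3       # a binding ceiling, set below the equilibrium price

for label, price in [("Market price", equilibrium_price), ("Price ceiling", price_ceiling)]:
    print(
        f"{label} = {price}: demanded {quantity_demanded(price)}, "
        f"supplied {quantity_supplied(price)}, shortage {shortage(price)}"
    )
```

With these made-up curves, the ceiling leaves 70 units demanded but only 50 supplied, so 20 units of demand go unmet and have to be rationed some other way.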

Shortages mean that the limited quantity available must be rationed in some way among the many consumers who want to buy at the low price. Usually, price is the main rationing mechanism in the market (only consumers who are willing and able to pay the market price will buy the good). However, when there is a price ceiling keeping the price artificially low, then some form of non-price rationing is going to have to occur. Perhaps this rationing is based on who can get to the store first in the morning when the new stock is available, or is lucky enough to be at the store when shelves are re-stocked. Perhaps consumers have to queue in order to avoid missing out. Perhaps retailers have a lottery. Perhaps there is a rationing system where consumers are limited in the quantity they are allowed to buy. Perhaps interested consumers sign up and receive tickets that guarantee them a small amount of the good.

Notice how all of these non-price rationing alternatives do one of two things: (1) they impose a direct (non-monetary) cost on consumers (such as the time cost of queueing); or (2) they involve an element of luck. Higher non-monetary costs simply undo a lot of the good that the price ceiling was intended to create. Getting a good that you want only because you were lucky in a lottery, or happened to be in-store when shelves were re-stocked, is in my view not a particularly fair allocation system.

These non-price rationing schemes are also open to abuse, by consumers who are lucky (or who are happy to face the non-monetary cost) on-selling the goods to other consumers who are willing to pay more. This is the black market that Lewis refers to. And higher black market prices than the controlled price provide an incentive for unscrupulous sellers to ensure that their friends receive the goods in the lottery, or just happen to be in store at the right time. This sort of corruption is simply not worthwhile if there is no price ceiling in place.

For some graphic examples of how price ceilings can go wrong, look no further than rent controls (see some of my posts on that here, here, here, and here). The short version is that rent controls reduce economic welfare, they reduce the quality of housing available to rent, and they may even increase inequality. And, their effects get worse over time. In a famous quote, the Swedish economist Assar Lindbeck (who passed away in 2020) wrote that “Rent control appears to be the most efficient technique presently known to destroy a city - except for bombing.”

So, even though some consumers will certainly benefit from the lower prices that price controls create, we must never lose sight of the fact that they come with significant negative consequences as well.

Wednesday, 22 June 2022

Student preferences for online vs. in-person education

I've posted a couple of times before about research on student preferences for online or in-person education (see here and here). The takeaway from the two research papers I referenced in those posts was that on average students had a preference for in-person classes, because they were willing to pay less for online options, but a substantial minority of students do prefer online classes. A new article by Lauren Steimle (Georgia Institute of Technology) and co-authors, published in the journal Socio-Economic Planning Sciences (ungated earlier version here), adds a bit more detail to our understanding of these varying student preferences.

Steimle et al. collected survey data in June and July 2020 from 398 Georgia Tech industrial engineering students across all levels of undergraduate study. Importantly, their survey included a discrete choice experiment (DCE), which allows them to evaluate the willingness-to-pay of students for different characteristics of a return to study in the Fall semester of 2020. The characteristics (referred to as attributes) that Steimle et al. investigated:

...were Mode of Course Delivery, Safety on Campus, Residence Hall Operating Capacity, Tuition Reduction, and Limits on Events and Social Gatherings. Each of these attributes was assigned 4 levels based on the latest recommendations from the Centers for Disease Control and Prevention’s interim guidance to institutions of higher education to prepare for COVID-19...

Including the tuition reduction attribute allows Steimle et al. to measure the intensity of students' preferences for the other attributes in terms of how much percentage equivalent tuition reduction they are worth (which they refer to as tuition percentage point equivalent, or TPPE). Before getting to analysing the DCE, Steimle et al. look at the factors associated with students preferring online rather than in-person courses generally, finding that:

i. Students with greater current concern level are more likely to choose online courses than students with lower concern levels,

ii. Students with higher perceived risk of infection are more likely to choose online courses than those with lower perceived risk of infection,

iii. Students with better current living suitability for online courses are more likely to choose online courses than students with worse current living suitability for online courses,

iv. Students who are more risk-seeking are more likely to choose in-person courses than students who are less risk-seeking, and

v. Younger students are more likely to choose in-person courses than older students.

For the DCE, Steimle et al. employ a latent class analysis, which means that the research participants are first sorted into groups (classes), where each class has different preferences than the other classes. In this case, the optimal solution has three classes (although, looking at their diagnostics, I'm a little confused as to why they didn't choose a four-class approach, because it does appear that four classes fit the data better). The three classes differ most in terms of how concerned the students are about the pandemic, and Steimle et al. label them 'low-concern', 'moderate-concern', and 'high-concern', where:

...29% of students fell into Class 1, being not-so-concerned, while 17% fell into Class 3, being highly concerned. Most of the students were in the “moderate-concern” class (i.e., there was an average 54% probability of belonging to Class 2).

Looking at the types of students in each class, Steimle et al. note that:

...students in the “moderate-concern” class or “high-concern” class had slightly better current living suitability for online courses... and were more likely to live off campus in Fall 2020...

Asian/Pacific Islander students were more likely to be in Class 2 or Class 3, while White/Caucasian students were more likely to be in Class 1... Students in the “moderate-concern” class or “high-concern” class tended to politically lean Democrat and had a more liberal world view. “Low-concern” students were inclined to lean Republican and had a more conservative world view. We did not observe substantial differences in the share of those on financial aid among the three classes.

In terms of the DCE results (analysed separately for each class of students), Steimle et al. find that:

...students in more concerned segments (i.e., Class 2 and Class 3) placed more importance on modes incorporating at least some online courses. However, students in Class 1 did not like having online courses compared to completely in-person courses. Students in the “high-concern” segment (Class 3) put much more importance on entirely online courses when deciding to enroll compared to students in the “moderate-concern” segment (Class 2), indicated by the result that the TPPE of “All courses delivered entirely online” in Class 3 was more than twice as large as the corresponding TPPE in Class 2.

Not too surprising there. In general, the high-concern students tend to prefer online classes, while the low-concern students tend to prefer in-person classes. However, that doesn't provide too much help to universities trying to decide what mix of attributes would best ensure high enrolments. Fortunately, Steimle et al. outline some scenarios based on different combinations of attributes:

The first is a “business-as-usual” scenario in which courses are delivered entirely in-person, no requirement on mask-wearing and no testing, 100% operating capacity for residence halls, full tuition, and no limit on the size of social gatherings. Under a “business-as-usual” scenario, the low-concern class (Class 1) was predicted to enroll with 98.9% probability and the “moderate-concern” class (Class 2) was predicted to enroll with 94.8% probability. However, the “high-concern” class (Class 3) was predicted to enroll with only 17.0% probability. Weighting by the class membership probabilities, the weighted average enrollment probability is 82.8% for this “business-as-usual” scenario. In contrast, another tested scenario is a “completely online” scenario in which a 5% tuition reduction is given and has weighted average enrollment probability of 94.6% with the low-concern class enrolling with 85.4% probability, the moderate-concern class enrolling with 99.7% probability, and the high-concern class enrolling with 93.9% probability. A scenario with a higher enrollment probability is a “strict on-campus hybrid” scenario in which large courses are delivered online with small courses delivered in-person, required mask-wearing and extensive testing, residence halls are at 25% capacity (in which there are no roommates and no shared bathrooms), no tuition reduction, and a limit of 20 people at social gatherings. This “strict on-campus hybrid” scenario has a weighted average enrollment probability of 97.6% because it broadly appeals to students from the different classes: low-concern class is expected to enroll with 94.7% probability, the moderate-concern class has a near 100% probability of enrolling, and the high-concern class has a 95.1% probability of enrolling.

All of this seems to accord with the results in the earlier studies, but with a bit more detail in terms of analysis. However, it pays to bear in mind that this study was limited to industrial engineering students, who might have greater preferences for in-person education due to the need for in-person labs. On the other hand, students with greater concern for their health might have been more likely to respond to the survey (although the latent class analysis segregated them into their own class, the proportions of students of each type would be biased, and therefore so would the enrolment scenarios that Steimle et al. presented).

Finally, given that the survey was undertaken while there was still great uncertainty over the state of the pandemic and future risk, we probably can't extrapolate from it to understanding student preferences for online or in-person classes outside of pandemic times. For that, we would need to replicate the study at a time when in-person classes are less inherently risky for students.

Tuesday, 21 June 2022

Income inequality and economic growth in Australia

The relationship between income inequality and economic growth is theoretically ambiguous. You could argue that income inequality should increase economic growth, because: (1) inequality means that there are more people with high incomes, higher income people save more, and more savings increases funds available for investment spending, which increases productivity and economic growth; or (2) inequality increases the incentives for people to work harder and get ahead, increasing productivity and economic growth. On the other hand, income inequality could decrease economic growth, because: (1) inequality creates incentives for people to engage in rent-seeking behaviour, such as buying political favours, which is wasteful of resources; or (2) people dislike inequality, so when inequality is high, they pressure the government to engage in redistribution, which reduces work incentives, productivity, and economic growth. So, given the theoretical ambiguity, the only way to establish this relationship is empirically, using data.

That is what this 2017 article by Tom Kennedy (University of New England), Russell Smyth (Monash University), Abbas Valadkhani (Swinburne University of Technology), and George Chen (University of New England), published in the journal Economic Modelling (ungated earlier version here), attempts to do, using Australian data. Specifically, Kennedy et al. construct time series of inequality and economic growth at the state (and territory) level for Australia over the period from 1986 to 2013. They then apply a fairly straightforward panel regression analysis, controlling for state-level investments in physical and human capital, and find that (in their Model 1):

...the effect of inequality on growth is negative and highly significant at the 1% level, suggesting that falling income inequality can substantially boost economic growth. On average, an additional 10% rise in the growth of inequality can bring about a 2.55% fall in real output growth...

A second model, employing a somewhat different specification, yields substantially similar results. Kennedy et al. also look at both contemporaneous and lagged effects of each variable on economic growth. On the lagged effects, they conclude that:

While policies aimed at increasing physical capital can immediately boost economic growth, the impact of a rise in human capital, or a fall in inequality, on the Australian economy appear to be statistically significant, but with one and two years delay, respectively...

These results should suggest to us that, of the competing theoretical mechanisms linking inequality and economic growth, the negative effects on balance appear to outweigh the positive effects. However, Kennedy et al. don't provide any results that might tell us which mechanisms are at play (or indeed whether there are other mechanisms we haven't thought of that drive this observed correlation). Their analysis also falls short of demonstrating a causal relationship from inequality to lower economic growth. For a better understanding of the causal relationship and underlying mechanisms, we are going to need more research in the future.

Sunday, 19 June 2022

Socioeconomic status and the health benefits of moderate drinking among older people

A common finding in studies of the relationship between alcohol use and health is a J-curve relationship - moderate drinkers have better health than abstainers, and better health than heavy drinkers. However, there are a lot of confounding factors in the observed relationship between drinking and health. So much so that it is difficult (perhaps impossible) to establish whether there is a causal relationship between them.

One confounding factor is socioeconomic status, and that is the focus of this 2018 article by Andy Towers, Michael Philipp (both Massey University), Patrick Dulin (University of Alaska), and Joanne Allen (Massey University), published in the Journals of Gerontology: Social Sciences (open access). They use data from the New Zealand Health, Work and Retirement Study 2012 wave, which included nearly 3000 participants aged 52-86 years, and look at how daily average alcohol consumption and self-reported health are related, after controlling for socioeconomic status. Their measure of health status is the widely-used SF-12 physical health score, and their measure of socioeconomic status is the ELSI-SF, used by the Ministry of Social Development, as well as income and education.

Figure 1 in the paper illustrates why controlling for socioeconomic status is probably pretty important when looking at the relationship between alcohol consumption and physical health (for women; there is a similar figure in the article that shows the relationship for men):

Notice that the measure of physical health varies over different levels of drinking (LA is lifetime abstainers; and CND is current non-drinkers) in a very similar way to the measure of socioeconomic status (see also this earlier post on the relationship between drinking and earnings). When Towers et al. include both drinking and socioeconomic status in a regression model of physical health, they find that:

For older men, the quadratic relationship between alcohol and health is evident when controlling for SES proxies but disappears when SES is controlled for. For women, the direct linear relationship between alcohol and health exists after controlling for SES proxies and is substantially reduced - though still evident - when controlling for SES.

This probably overstates the case (although not to the same extent as the title of the article does). While drinking becomes statistically insignificant in the model of physical health for men, the point estimates on the coefficients for alcohol fall by roughly two-thirds. So, it would be better to say that socioeconomic status explains about two-thirds of the relationship between drinking and physical health for men. For women, socioeconomic status explains about forty percent of the relationship, which remains statistically significant after socioeconomic status is included.
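
To show what "explains about two-thirds" means mechanically, here is a small Python sketch using simulated toy data. Nothing in it comes from the Towers et al. data: the sample size, the effect sizes, and all of the function and variable names are my own assumptions. Socioeconomic status (SES) is made to drive both drinking and health, and the coefficient on drinking is then estimated with and without SES as a control. The adjusted coefficient is computed by first stripping SES out of both variables, which gives the same number as including SES in the regression.

```python
import random


def mean(values):
    return sum(values) / len(values)


def slope(y, x):
    """OLS slope from regressing y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def residuals(y, x):
    """What is left of y after removing its linear relationship with x."""
    b = slope(y, x)
    a = mean(y) - b * mean(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]


# ASSUMPTION: simulated data with made-up effect sizes, for illustration only.
rng = random.Random(0)
n = 20000
ses = [rng.gauss(0, 1) for _ in range(n)]
drinking = [0.8 * s + rng.gauss(0, 1) for s in ses]
health = [0.1 * d + 1.0 * s + rng.gauss(0, 1) for d, s in zip(drinking, ses)]

# Coefficient on drinking when SES is ignored.
unadjusted = slope(health, drinking)

# Coefficient on drinking after controlling for SES.
adjusted = slope(residuals(health, ses), residuals(drinking, ses))

# Proportional fall in the point estimate once SES is controlled for.
share_explained = 1 - adjusted / unadjusted

print(f"unadjusted: {unadjusted:.3f}")
print(f"adjusted:   {adjusted:.3f}")
print(f"share of the raw association explained by SES: {share_explained:.0%}")
```

The share is calculated from the point estimates alone. Whether the adjusted coefficient is still statistically significant is a separate question, which is why I prefer describing the proportional fall over saying that the relationship "disappears".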

Of course, none of this is to say that the problem of causality is solved. The models that Towers et al. use are still cross-sectional and only show correlations, and they are up-front about that in the conclusion to their article:

However, while our study does not provide evidence for causal relationships among the variables it is logical to suggest that, rather than alcohol having a beneficial influence on the health of older adults, higher SES levels may simply facilitate lifestyles that enable both regular moderate drinking and the capacity for better health.

There are other limitations as well, not least of which is that current drinking is being linked to current physical health, but current health probably depends on many years of drinking behaviour rather than just current drinking (which is why we need longitudinal studies, and is why it is appropriate that Towers et al. drop current non-drinkers from their analysis sample). However, this study does illustrate the confounding effect that socioeconomic status exerts when applying simple methods to compare physical health between people at different levels of drinking. If you didn't fully consider socioeconomic status, you could easily overstate the positive effects of moderate drinking.

Friday, 17 June 2022

A counter-point to Wednesday's post on cultural differences and the gender wage gap

On Wednesday, I wrote a post about how much of the gender wage gap depends on cultural differences. It turns out that, at least in the market for top executives (and especially CEOs), cultural differences explain a large part of the gender wage gap that remains after you control for age, tenure, job title, year, and industry, as well as a number of firm characteristics. However, the underlying idea that cultural differences matter is questioned in this 2016 working paper by Charlotta Stern (Stockholm University), which was subsequently published in 2017 as a book chapter here.

Stern's target is really the sociological study of gender in the labour market, but I suspect that she sees the underlying argument as applying more broadly. In particular, she critiques sociologists' preference for explaining differences in labour market outcomes between men and women as arising from cultural differences, rather than innate biological differences or differences in preferences. Stern concludes that:

The left-feminists’ domination of gender sociology has resulted in a strong norm to explain differences between men and women only in terms of culture, broadly defined, and to ignore or gloss over biological or preference explanations, and hence to interpret differences in outcome as resulting from socialization into gender roles or to discrimination of various sorts. The taboo is kept in place by a groupthink mentality where it seems scholars fear that even a slight dissension from the constructivist view would cause expulsion and charges of anti-feminism...

Applying Stern's argument to the cultural differences paper I discussed on Wednesday would suggest that there are important omitted variables in the analysis of gender differences in wages, being biological differences, and differences in preferences. However, both of these omitted variables are problematic (or at best unhelpful) for developing an understanding of gender differences in wages.

The biological differences argument is problematic in a pragmatic sense, because gender differences in wages are tautologically entirely explained by biological differences (if we conveniently ignore the social construction of gender). So, there would be no statistical way to feasibly disentangle any other contributors to gender differences in wages, after controlling for biological differences. Gender differences in wages would be entirely explained by biological differences.

Stern's argument in favour of considering preferences as separate from cultural differences is problematic for a more theoretical reason. Preferences are themselves socially constructed. Stern argues that there are evolutionary differences in preferences, but the evidence in favour of that assertion is relatively weak. It is difficult to take seriously a theory where the implication is that a significant proportion of people's preferences are fixed and determined at birth. And it would fly in the face of real-world experience, where our preferences for things (including occupational and work preferences) change over time, in response to external influences such as our family situation, our peers, and the prevailing norms of the society we live in. That is, our preferences are determined by culture. So, accounting for differences in preferences between genders would simply be accounting for cultural differences in a different guise.

Neither biological differences nor differences in innate preferences offers a policy solution to gender wage differentials. Therefore, one implication of Stern's critique is that, because such differentials are fully explained by factors that are not amenable to change, they are not worth troubling ourselves over unless they offend us ideologically. I'm not usually in favour of staking out strong ideological positions (see here for more of my thoughts on ideology), but in this case, I feel like I would be on the right side.

[HT: Marginal Revolution, back in 2018 (the paper sat in my to-be-read pile for a while!)]

Wednesday, 15 June 2022

Cultural differences and the gender wage gap

The gender wage gap differs substantially across countries. For instance, here's the gender wage gap (as a percentage of the male wage) across OECD countries for the latest available year (from here):

It's not easy to see (go to the source for a bigger version), but the blue column is New Zealand (4.6 percent), the black is the OECD average (11.6 percent), the red is Australia (12.3 percent), and the purple is the U.S. (17.7 percent). The column at the far right, at a whopping 31.5 percent, is South Korea. But what explains these differences? Some of the difference no doubt arises from policy differences, but how much of it is cultural differences?

That is the question that is addressed in this new NBER Working Paper (sorry, I don't see an ungated version online) by Natasha Burns (University of Texas at San Antonio), Kristina Minnick (Bentley University), Jeffry Netter (University of Georgia), and Laura Starks (University of Texas at Austin). They look specifically at the gender wage gap for corporate executives, reasoning that:

...studying compensation at the executive level gives us the advantage of a particularly competitive market (due to the level of compensation and the frequent use of consultants) which should result in lower gender compensation gaps...

The competitive nature of the labour market for corporate executives should also abstract away some of the inefficiencies that are more likely to be present in other labour markets. Burns et al. use data on executive compensation from Standard and Poor's Capital IQ. Their data covers 31 countries over the period from 2004 to 2016. For cultural differences, they construct measures using data from the World Values Survey. As they explain:

We first consider a group of WVS questions directly related to values, attitudes, and beliefs regarding women, specifically, questions covering women’s entitlement to education, the role of women in society, and leadership abilities of men versus women...

We also employ a group of WVS questions that we expect to be related to values, attitudes, and beliefs regarding women. These questions focus on religion, the acceptance of violence toward women, and intolerance in the society...

We also employ a measure of the justification of violence towards women and children as an additional way to capture cultural attitudes toward women...

Finally, we use a group of variables that more generally capture a society’s values, attitudes, and beliefs about work, success, markets and ethics, which should reflect views toward executive compensation in general, as these variables have been employed in previous research.

That gives Burns et al. a battery of different measures of the dimensions of culture (specifically limited to dimensions related to gender attitudes and work attitudes), which they reduce further to three factors using principal component analysis. Those three factors are labelled F1 (Religion, Violence, Intolerance, and Corruption), F2 (Gender education and Gender work), and F3 (Hard work, Individualism, and Trust).

Turning to their results, Burns et al. first show that:

...there exist significant gender gaps in compensation for the top executives. Across all countries the average compensation for male CEOs is $1.81mm compared to $1.41mm for female CEOs.

There is also a gender pay gap for other executives at levels lower than CEOs, but the gap is somewhat smaller. The gap also varies significantly between countries. To explain the differences, they first regress executive wages on gender, controlling for age, tenure, job title, year, and industry, as well as a number of firm characteristics. They find that:

...the average gender gap is not small as the coefficient shows an average gender pay gap of 16.6% across all countries and executive positions.

Next, adding cultural variables into their models, they find:

...executive compensation to be significantly related to each of these cultural variables on its own and when interacted with the female gender indicator...

Specifically, in relation to the three cultural factors:

...executive compensation loads negatively on factor F1 (Religion, Intolerance, Violence, Corruption). Thus, in societies in which these cultural beliefs and attitudes are more prevalent, the results suggest lower compensation for all executives. In addition, the interaction term between F1 and gender suggests that the reduction in compensation is even greater for women executives in the countries that rank higher in these dimensions. The comparison between the F1 results and the Model 2 F2 (Gender_education and Gender_work) results or the Model 3 F3 (Hardwork, Individualism and Trust) results are striking. The level of executive compensation in general is positively related, and the gender gap is negatively related, to the cultural norms reflected in F2 and F3. These results support the hypotheses that overall compensation is greater and the gender gap is smaller in societies that believe women are entitled to equal education, that value women’s roles in the workplace, hard work and individualism, and where trust is higher.

So, how much of the gender wage gap for executives is explained by culture? Burns et al. employ an Oaxaca-Blinder decomposition, which shows that:

...when we use the three factors from the factor analysis, the unexplained portions reduce from 58% to 7.4% for CEOs, 48% to 21.7% for the Top 3 Executives, and 49.2% to 39.1% for the Other Executives.

That means that the control variables explain 42 percent of the gender gap in CEO wages (leaving 58 percent unexplained), and the cultural variables (F1, F2, and F3) reduce the unexplained fraction to 7.4 percent. Cultural differences therefore explain around half (50.6 percent) of the gender wage gap for CEOs. The corresponding fraction for top three executives (president, COO, and CFO) is 26.3 percent, and 10.1 percent for other executives.
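
The arithmetic behind those shares is just a subtraction, and a short Python sketch makes it easy to check. The unexplained portions are the ones in the quotation from Burns et al. above. The function and variable names are my own, and this is only the final step. It is not a re-run of the Oaxaca-Blinder decomposition itself.

```python
# Unexplained share of the gender pay gap (percent), before and after adding
# the three cultural factors, as quoted from Burns et al. above.
unexplained = {
    "CEOs": (58.0, 7.4),
    "Top 3 executives": (48.0, 21.7),
    "Other executives": (49.2, 39.1),
}


def culture_share(before, after):
    """Percentage points of the gap that move from unexplained to explained."""
    return round(before - after, 1)


shares = {
    group: culture_share(before, after)
    for group, (before, after) in unexplained.items()
}

for group, share in shares.items():
    print(f"{group}: {share} percent of the gap attributed to the cultural factors")
```

Reading the results this way treats the whole fall in the unexplained portion as being due to the cultural factors, which is the interpretation used in this post.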

What is interesting is that more of the gender pay gap is explained by cultural differences for CEOs than for less highly ranked executives. Burns et al. don't really offer much of an explanation as to why that might be. It would be interesting to see if this pattern extends down to middle management and below (by which time, following the decreasing trend, cultural differences may not explain much if any of the gender wage gap).

It would be easy to take the results of this paper, and be a bit fatalistic about efforts to reduce the gender wage gap, because culture is not easy to change. However, Burns et al. provide some evidence that policy change can matter. They look at the gap before and after the introduction of paternity leave laws and board diversity laws, finding that the gender wage gap decreases significantly in each case (although, I think the results of those analyses seem implausibly large, and Burns et al. don't provide any discussion of the effect size, which suggests that they might also doubt the results).

Still, this is an interesting paper, which points to the importance of cultural differences in contributing to the gender wage gap. No doubt there is more research to come in this space.

[HT: Marginal Revolution]

Tuesday, 14 June 2022

Is it time to reconsider nudge theory?

In a surprising recent working paper, Nick Chater (University of Warwick) and George Loewenstein (Carnegie Mellon University) outline a case against 'nudges' as a policy tool. As you may know, nudges take advantage of insights from behavioural science, psychology, and behavioural economics, to change individual behaviour for the better. This idea was popularised in Richard Thaler and Cass Sunstein's 2008 book Nudge. Chater and Loewenstein have been at the forefront of the nudge movement, as members of the advisory board of the U.K.'s Behavioural Insights Team (popularly known as the 'nudge unit').

What is surprising about the working paper is that this is a well-reasoned critique of using nudges to address policy issues, written by two nudge policy 'insiders'. Their main argument is best summarised in the long abstract to the paper:

An influential line of thinking in behavioral science, to which the two authors have long subscribed, is that many of society’s most pressing problems can be addressed cheaply and effectively at the level of the individual, without modifying the system in which individuals operate. Along with, we suspect, many colleagues in both academic and policy communities, we now believe this was a mistake. Results from such interventions have been disappointingly modest. But more importantly, they have guided many (though by no means all) behavioral scientists to frame policy problems in individual, not systemic, terms: to adopt what we call the “i-frame,” rather than the “s-frame.”

Chater and Loewenstein distinguish between 'i-frame' interventions and 's-frame' interventions throughout the paper. They explain the difference as:

The behavioral and brain sciences are primarily focused on what we will call the i-frame: that is, on individuals, and the neural and cognitive machinery that underpins their thoughts and behaviors. Public policy, by contrast, is typically focused on the s-frame: the system of rules, norms and institutions by which we live, typically seen as the natural domain of economists, sociologists, legal scholars and political scientists.

The difference is important, because:

Unlike traditional policies, i-frame interventions don’t fundamentally change the rules of the game, but make often subtle adjustments that promise to help cognitively frail individuals play the game better.

However, the problem that Chater and Loewenstein see is not limited to the modest effects of i-frame interventions when compared to potential alternative s-frame policy or institutional change. They note that:

We have begun to worry that seeing individual cognitive limitations as the source of problems may be analogous to seeing human physiological limitations as the key to the problems of malnutrition or lack of shelter. Humans are physiologically vulnerable to cold, malnutrition, disease, predation and violent conflict, and an i-frame perspective on these problems would focus on hints and tips to help individuals survive in a hostile world. But human progress has arisen through s-frame changes---the invention and sharing of technologies, economic institutions, legal and political systems, and much more, which created an intricate social, political and economic system that has led to spectacular improvements in the material dimensions of life. The physiology of individual humans has changed little over time and across societies; but the systems of rules we live by have changed immeasurably. Successful s-frame change has been transformative in overcoming our physiological frailties.

Chater and Loewenstein also worry that i-frame interventions get in the way of potentially successful s-frame changes:

There is, moreover, a more subtle way in which i-frame interventions undermine s-frame changes: through shifting standards of what counts as good quality evidence for public policy. For many i-frame policies, randomized controlled trials have been widely viewed as a gold-standard method for evaluating and incrementally improving policy... But the gold-standard of experimental testing provides a further push towards i-frame interventions (where different individuals may be randomly assigned distinct interventions) and away from s-frame interventions, where it is rarely possible to change the “system” for some subset of the population...

They also argue that corporations have actively weaponised the i-frame to prevent s-frame solutions. In their view, this has played out following a common pattern:

1. Corporations with an interest in maintaining the status quo put out PR messages that the solution to a problem they are associated with lies with individual responsibility, and that people need to be helped to exercise that responsibility more effectively. That is, the challenge of fixing the social problem is cast in the i-frame...

2. Behavioral scientists enthusiastically engage with the i-frame...

3. There are hopes that proposed i-frame interventions (including nudges, and providing better individual-level incentives, information and education) might provide cheap and effective solutions to conventional s-frame policy levers, such as regulation and taxation. This hope distracts attention from the s-frame...

4. The i-frame interventions show at best modest, and often null, effects, and are sometimes even counterproductive...

5. Corporations themselves relentlessly target the s-frame, where they know the real leverage lies. They spend substantial resources on media campaigns, lobbying, funding think-tanks and sponsoring academic research, to ensure that the “rules of the game” reinforce the status quo.

Chater and Loewenstein outline a number of problems where i-frame interventions have been tested, but where s-frame policy or institutional change would be much more effective. These include climate change (e.g. i-framed individual carbon footprint calculators vs. s-framed carbon taxes), obesity (e.g. i-framed motivational interventions, tray-less cafeterias, etc. vs. s-framed sugar taxes or regulations), retirement savings (e.g. i-framed individual retirement savings vs. s-framed universal retirement saving or public pensions), and plastic waste (e.g. i-framed individual responsibility for recycling vs. s-framed regulations banning single-use plastics). They also point in lesser detail to a number of other applications where i-frame thinking gets in the way of s-frame solutions, including healthcare reform, educational inequality, discrimination, online privacy, misinformation on social media, the opioid epidemic, and gun violence.

Not everyone will agree with Chater and Loewenstein's critiques. However, they are all the more forceful and worth paying attention to, because they come from within the ranks of the advocates of behavioural interventions. That makes this paper potentially much more consequential than previous libertarian critiques, such as those in the book Nudge Theory in Action (which I reviewed here). It will be interesting to see how the advocates of nudge theory respond.

[HT: Tim Harford]

Monday, 13 June 2022

Why study economics? Wizards of the Coast edition...

Have you ever wondered what happens in the labour market for wizards' henchmen when an evil wizard is defeated? Or stayed up at night worrying about inflation caused by adventurers looting dragon hoards then returning to the city with their newly-acquired riches? Have I got the job for you! Wizards of the Coast is hiring:

At Wizards of the Coast, we connect people around the world through play and imagination. From our genre defining games like Magic: The Gathering® and Dungeons & Dragons® to our growing multiverse, we continue to innovate and build new ways to foster friendship and connection. That’s where you come in!

Magic: The Gathering is a card game played and collected across the globe, with a wide-ranging assortment of products designed to engage a wide range of ways people enjoy playing Magic. As a Sr. Design Economist, you will help us better understand how Magic is played and purchased to help us make better, faster strategic decisions.

What You'll Do:

  • Learn from the Past: Study the data and trends to discover insights, new perspectives, and opportunities to improve how we serve different types of customers and markets.

  • Live in the Moment: Track and report on sales, identify market channels that are over/underperforming, and refine our projections and strategies in real time.

  • Predict the Future: Project product sales to inform print runs and market allocation for products we have made for decades, and to inform design of products we’ve never made before.

  • Boost our Agility: Help us adapt faster to changes in market conditions or behavior.

  • Make our Party Smarter: Work with our design and sales teams to identify key holes in our understanding, conduct impactful studies, and communicate actionable insights. 

Ok, the successful applicant won't exactly be solving economic policy issues in fantasy worlds. But still, this is no doubt going to be a dream job for someone. And, let's face it, could there be a cooler application of economics than to gaming? This illustrates the underappreciated breadth of jobs that studying economics sets students up for (with many more in the associated links below).

[HT: Marginal Revolution]

Read more:

Sunday, 12 June 2022

Book review: Mission Economy

The world is simultaneously facing many challenges - climate change, inequality, population ageing, persistent gender gaps, the digital divide, transition to green energy, pandemic disease, and more. How are governments to best deal with these problems? In her recent book, Mission Economy, Mariana Mazzucato offers her view on a framework for developing solutions to society's wicked problems. Mazzucato, a professor in the Economics of Innovation and Public Value at University College London, argues for a mission-oriented approach. The exemplar that Mazzucato uses is the Apollo program which, in less than a decade, successfully launched astronauts to the moon and brought them back safely. Apollo focused resources on a single measurable goal, aligning public and private organisations towards a common goal.

I have a lot of sympathy for Mazzucato's proposal. On the whole, governments and the private sector are consistently failing to adequately tackle the modern world's key problems. It may require a radical readjustment in the overall approach, and that is what Mazzucato is offering. In the concluding section of the book, she offers seven key pillars for a new political economy to guide the mission-oriented approach:

  1. A new approach to value, with business, government, and civil society creating value together;
  2. A different framing of policy related to markets, moving from fixing market failures to actively co-creating and co-shaping markets;
  3. Organisational change, particularly for governments to reduce outsourcing and to share risk with the private sector;
  4. A change in the way that budgets and public financing are considered, moving from the economy serving finance, to finance serving the economy;
  5. Distribution and inclusive growth, focusing on predistribution and not just redistribution;
  6. A greater focus on partnership and stakeholder value; and
  7. Participation in the creation process, through a revival of debate, discussion, and consensus-building.

There is a lot that is arguable in the discussion of those pillars, but also a lot to like about the overall vision. However, despite the attractiveness of the ideas Mazzucato puts forward in this book, there are several key problems that I felt were not adequately addressed. The first problem is how missions are chosen. Mazzucato devotes a fair amount of attention to this point, but her solution, based on democratic institutions, discussion, and consensus-building, to me ignores the problems inherent in such an approach. The Apollo program worked well, but it was a mission that was chosen by President Kennedy in a top-down fashion. If, as Mazzucato contends, we are to build missions from the ground up, how are we to contend with competing preferences? Arrow's Impossibility Theorem reigns here. A system that results in multiple different concurrent missions simply risks competition for the same resources. Mazzucato refers to the Sustainable Development Goals, but there are 17 of them (with 169 targets) - this is the same problem that faces the 'beyond GDP' agenda (as I noted in my recent reviews of Measuring What Counts and For Good Measure, here and here). The Apollo program worked because there was a single mission to focus on, not 17 competing (albeit somewhat complementary) missions. The democratic approach to choosing a mission is also fraught, likely to lead to missions that suit the majority and ignore or pay minimal lip service to minority preferences. Mazzucato makes reference to Gil Scott-Heron's poem "Whitey on the Moon" (a stinging critique of the focus on the Apollo programme, while many of society's problems, and especially those relevant to African Americans, remained unsolved). However, Mazzucato's book lacks the necessary reflection that the mission-oriented approach needs to find some way to avoid the same problems that Scott-Heron refers to.

Mazzucato provides a critique of capitalism in her book, and argues that we need something better. However, the best mission-oriented economies are command economies. If you want to radically change economic and political institutions, or build infrastructure with minimal delays, having a completely centralised economy is one way to achieve it.

Second, and relatedly, the Apollo program worked well when there was one true source of information. The government could effectively control the narrative, and didn't have to worry too much about sniping from the sidelines. When social media gives everyone a platform to openly critique the government, it is difficult to maintain a focus on a single project, especially where there is always a chance that something is not going perfectly to plan. We've actually seen this in action in a way that is far too close for comfort. You could argue that New Zealand's response to the pandemic was mission-oriented. The mission was to eliminate the virus and prevent COVID-related deaths. Society was mostly on board with this mission, but not everyone. And every time that some part or other of the mission wasn't working perfectly, everyone was a critic. It would be even worse in a more politically polarised country than New Zealand. Mazzucato doesn't engage with this problem at all. It's not clear that she would want to, since 'controlling the narrative' in the modern sense is a tool employed by the most controlling autocratic regimes.

Finally, and potentially the biggest problem, is that there is a very real difference between a mission that is solving what is almost entirely an engineering, physics, and mathematics problem, and a mission that is trying to solve a human problem. To be fair, Mazzucato is aware of this point, writing that:

All this brings us back to the point that social missions are harder to fulfil than purely technological ones because they combine political, regulatory and behavioural changes.

That may be the understatement that defines this book. People are not simply chess pieces to be moved on a societal board. They have their own motivations and desires. Even if you can align them with a single goal in mind, people don't always do what you want. Solving human problems is hard. If it was easy, we would have done it already. If it was as easy as solving an engineering problem, then the most successful countries would be run by engineers. And excluding China (Xi Jinping trained as a chemical engineer), they're not.

The world has changed since the Apollo program was run. Small problems can quickly derail any large programme or initiative. Mazzucato refers to the Apollo I disaster in 1967, where three astronauts were burned alive during a dry run (they weren't even going to space yet at that stage). After six months of review, the programme was back on track. Contrast that with the Challenger and Columbia shuttle disasters, both of which grounded the space shuttles for two and a half years (and after Columbia, the programme was soon shut down). Serious mistakes early in a programme are likely to see the programme shut down. That makes the people in authority more risk averse, which is contrary to a radical, mission-oriented approach.

Having a mission-oriented economy is a bold vision, and an attractive one. We could start today. Some countries and governments have already started. But before we get too far, we're going to have to tackle the fundamental problems that a mission-oriented approach will face. I liked this book, and it made me think. I do recommend it, but as a starting point for thinking about how we can do things better.

Saturday, 11 June 2022

The value of market power in alcohol sales

As Reason reported this week, Massachusetts is looking at revising its alcohol sales restrictions. The current regime is interesting because:

Massachusetts restricts license availability in two ways: creating quotas on how many licenses are available based on an area's population and restricting the number of licenses a single entity can own. While some other states have population quotas or quantity caps, very few have both of these restrictions in place at the same time.

Population quotas for alcohol retailing licenses ensure that only one license can be granted per several thousand residents in a municipality, based on a formula laid out in state law. There is a quota in place for both on-premise establishments like restaurants, as well as off-premise sellers like grocery and liquor stores.

The result of this limitation is that if a certain region has already met its quota for licenses, then any new retailers hoping to open up are barred from obtaining a license. If an alcohol seller goes out of business, however, that old license can become available on the so-called secondary market.

Since these secondary licenses are often the only chance for a new business to gain the right to sell alcohol, they have become immensely expensive in certain parts of the state. In Boston, bar liquor licenses have sold for north of $450,000. In states where population quotas do not exist, licenses can cost less than $100, underscoring the extent to which Massachusetts unnecessarily saddles its businesses with prohibitive startup costs.

If the government restricts the number of alcohol licences, then this restricts competition in the market for alcohol, and creates some market power for the holders of licences. As you can see in this 2018 post, this would lead to the price of alcohol being higher, but more importantly, the profits of sellers would be higher with the restrictions than if there was more competition. That's why a bar licence in Boston can sell for $450,000, when in other states a similar licence sells for less than $100. The prospective bar owner is willing to pay a premium for the licence, because they know that it grants them market power. They can recoup the cost of the licence through their higher future profits, generated by that market power. The value of that market power is measured by the premium paid for the licence.
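To make that logic concrete, here is a minimal sketch that treats the licence premium as the present value of the extra profit that market power generates. The function name, the profit figure, and the discount rate are all hypothetical, chosen purely for illustration; none of them come from the Reason article.

```python
from typing import Optional


def licence_premium(extra_annual_profit: float,
                    discount_rate: float,
                    years: Optional[int] = None) -> float:
    """Present value of the extra profit a licence holder earns from market power.

    If `years` is None, the extra profit is assumed to last forever (a perpetuity).
    All inputs are illustrative assumptions, not estimates for Massachusetts.
    """
    if discount_rate <= 0:
        raise ValueError("discount_rate must be positive")
    if years is None:
        # Perpetuity: PV = profit / r
        return extra_annual_profit / discount_rate
    # Finite annuity: PV = profit * (1 - (1 + r)^-n) / r
    return extra_annual_profit * (1 - (1 + discount_rate) ** -years) / discount_rate


# Hypothetical example: $45,000 of extra profit per year, discounted at 10 percent.
premium = licence_premium(45_000, 0.10)
print(f"Implied licence premium: ${premium:,.0f}")
```

Under those made-up numbers, a buyer would be willing to pay about $450,000 for the licence, which is roughly the Boston figure. If the extra profits were expected to last only a limited number of years (say, because the quota might be relaxed), the premium would be lower.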

Now, if the Massachusetts state government was smart, and wanted to maximise the profits they generate from the sale of alcohol licences, they would require that licences are surrendered to the state when a seller goes out of business. The licence could then be auctioned to the highest bidder. This way, the government would be able to extract nearly all of the monopoly profits from the buyer of the licence, because in theory that is the most that the buyer would be willing to pay for the licence. That the Massachusetts state government doesn't do that, and that licences are instead sold on a secondary market, is interesting in itself.

Also interesting, but not surprising, is that licence holders prefer the current regulations over an alternative that would increase competition. An industry group representing licence holders has even gone so far as to offer a proposal that:

...cleverly packages a decrease in the most valuable type of license as an increase. Worse yet, the proposal does nothing to address the state's quota restrictions. This means that the overall pool of alcohol licenses would not increase, further condemning the state to its current restrictive system.

This is an example of rent seeking behaviour. The current licence holders are very profitable, so they have a strong incentive to try and protect (or even entrench) their current position, so that they can remain equally (or more) profitable in the future. This is why firms sometimes surprisingly prefer their own industry to be strictly regulated - it keeps out the competition.

Also interesting is that this is a case where public health advocates might actually agree with the alcohol sellers. As I noted in my 2018 post, having local alcohol monopolies increases the price, and reduces the quantity of alcohol consumed, leading to less alcohol-related harm. This is almost literally the example of 'bootleggers and Baptists', a term coined by the economist Bruce Yandle in the 1980s. Yandle noted that government regulations are often supported both by groups that propose the regulation (the Baptists), and by groups that should in theory be harmed but actually profit from the regulation (the bootleggers).

So, market power in alcohol sales may have value in two ways - value for the licence holders (higher profits) and value for public health (lower alcohol-related harm).

[HT: Eric Crampton at Offsetting Behaviour]

Thursday, 9 June 2022

Ethnic diversity of local government and decision-making gridlock

Is it better to have more (ethnically) diverse local government, or less diverse local government? There are valid theoretical arguments in both directions. If local government leaders (e.g. elected council members) are more diverse, then they will have a diversity of opinions and preferences, possibly leading to more disagreements and less consensus decision-making, and government may become 'gridlocked', unable to make key decisions. On the other hand, local government leaders do feel electoral pressure, including the pressure to conform, and to the extent that there is effective electoral pressure, gridlock would not be a problem.

Whether more diverse local governments spend less on public goods (or not) is the subject of this 2017 article by Brian Beach (College of William & Mary) and Daniel Jones (University of South Carolina), published in the American Economic Journal: Economic Policy (ungated version here). They use data from the:

...California Election Data Archive (CEDA), which provides the names and number of votes for every candidate in every local government election occurring between 1995 and 2011.

For the 5177 candidates who won elections (or were close but lost) over the period from 2005 to 2011, they collect data on ethnicity, either directly from city councils, or by asking workers on Amazon Mechanical Turk (mTurk) to classify the candidates' ethnicities. They got 10 mTurk workers to classify each candidate, and had 94 percent agreement overall (they drop the 31 candidates who had low agreement from their sample). This was similar to the level of agreement between mTurk workers and city council data (95 percent).

Beach and Jones then measure the ethnic diversity of each city council, using indices of fractionalisation and polarisation. As they explain:

Both indices range from zero to one, where zero corresponds to a situation where there is no diversity. Fractionalization is maximized when each council member is of a different ethnicity. Polarization, on the other hand, is maximized when the seats are distributed into two ethnic groups.
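As a rough illustration of how the two indices behave, here is a minimal sketch. It assumes the standard formulations (fractionalisation as one minus the sum of squared group shares, and the Reynal-Querol polarisation index), which may differ in detail from what Beach and Jones actually compute. The function names and the example councils are hypothetical.

```python
from collections import Counter


def group_shares(ethnicities):
    """Share of council seats held by each ethnic group."""
    if not ethnicities:
        raise ValueError("council must have at least one member")
    counts = Counter(ethnicities)
    return [count / len(ethnicities) for count in counts.values()]


def fractionalization(ethnicities):
    """Assumed form: 1 - sum of squared shares (chance two random members differ)."""
    return 1 - sum(s ** 2 for s in group_shares(ethnicities))


def polarization(ethnicities):
    """Assumed form (Reynal-Querol): 4 * sum of s^2 * (1 - s)."""
    return 4 * sum(s ** 2 * (1 - s) for s in group_shares(ethnicities))


# Hypothetical four-seat councils
councils = {
    "no diversity": ["A", "A", "A", "A"],
    "two equal groups": ["A", "A", "B", "B"],
    "all different": ["A", "B", "C", "D"],
}
for label, council in councils.items():
    print(label, fractionalization(council), polarization(council))
```

With these formulas, the homogeneous council scores zero on both indices, the evenly split council maximises polarisation, and the council where every member differs maximises fractionalisation. Note that under this assumed formula, fractionalisation for a council of n members tops out at 1 - 1/n, so it only approaches one for large councils.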

The outcome variable that Beach and Jones are most interested in is public goods expenditure, which they calculate:

...by taking a city’s total expenditures for the year and removing expenditures on “government administration” and debt repayment. The “public goods” category therefore includes all spending on roads, parks, police protection, sewerage, public transportation, etc.

Now, a simple regression approach would be to look at the relationship between diversity (fractionalisation and polarisation) and public goods spending. The problem with that approach is the potential for endogeneity - maybe there are city-level factors that affect both the diversity of local government and local public goods spending. For example, perhaps having a more diverse population requires a greater variety of public goods, and more public goods spending, but also tends to lead to a more diverse city council. In that case, the estimated relationship between diversity of the council and public goods spending is biased because of the relationship of both variables to the diversity of the population overall. Beach and Jones deal with that problem by looking at what happens subsequent to close elections, where one of the candidates is of the majority ethnicity, and one is of a minority ethnicity. In sufficiently close elections, it is close to random which candidate is ultimately elected, providing some random variation in the diversity of the council that doesn't depend on any other variable.
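The intuition behind using close elections can be sketched with a small simulation. Everything here is made up for illustration: the data-generating process, the parameter values, and the assumed "true" effect of -13 are assumptions, not the authors' data or model. The sketch also uses a simple comparison of means within a narrow vote margin, which is a simplification of a proper regression discontinuity analysis.

```python
import random
from statistics import mean

TRUE_EFFECT = -13.0  # hypothetical effect on spending of electing the minority candidate


def simulate(n_cities=200_000, seed=0):
    """Simulate cities where population diversity confounds the naive comparison."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_cities):
        diversity = rng.random()  # population diversity (the confounder)
        # Minority candidate's vote margin rises with population diversity
        margin = (diversity - 0.5) + rng.gauss(0, 0.2)
        wins = margin > 0
        # Spending rises with diversity, and falls if the minority candidate wins
        spending = 100 + 20 * diversity + TRUE_EFFECT * wins + rng.gauss(0, 5)
        rows.append((margin, wins, spending))
    return rows


def gap(rows):
    """Mean spending where the minority candidate won, minus where they lost."""
    won = [s for _, w, s in rows if w]
    lost = [s for _, w, s in rows if not w]
    return mean(won) - mean(lost)


rows = simulate()
naive = gap(rows)  # compares all elections
close = gap([r for r in rows if abs(r[0]) < 0.02])  # only very close elections
print(f"naive gap: {naive:.1f}, close-election gap: {close:.1f}")
```

In this simulated world, the naive gap should land well away from the assumed effect, because cities that elect minority candidates are also more diverse (and spend more for that reason). The close-election gap should land near the assumed effect, because within a narrow margin the winners and losers come from very similar cities.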

Beach and Jones identify 684 close elections with candidates of different ethnicities over the period from 2006 to 2009. Using that data, they find that:

Regardless of whether we measure diversity with fractionalization... or polarization... there is a strong and positive relationship between the election of a non-modal candidate and the diversity of the city council.

No surprises there. Electing a minority candidate increases the diversity of the council. Moving onto the effect on public goods, they find that:

...per capita spending on public goods falls by approximately 13 percent (significant at the 1 percent level) following the election of a non-modal candidate. The effect on nonpublic goods spending remains positive (roughly 14 percent) but is not significant at conventional levels.

So, more diverse local governments spend less on public goods. Beach and Jones then drill down into potential mechanisms that explain their results, and the consequences, and show that:

Our results indicate that diversity leads to gridlock. Cities reduce the amount they spend on public goods as their city council becomes increasingly diverse. These effects are largest for segregated cities and cities with more income inequality (where the potential for disagreement may be largest). We also find that all members of a council that experienced an exogenous shock to diversity receive fewer votes when they run for reelection. This latter point suggests that the city’s population is dissatisfied with the decline in public goods, ruling out the possibility that diverse councils simply achieve greater efficiency in public good provision.

So, ethnic diversity of local government appears to encourage gridlock, reducing local public goods spending, and it isn't an outcome that citizens favour. However, one thing that Beach and Jones didn't consider is whether (or to what extent) a match between the majority ethnicity of local government and the majority ethnicity of the population matters, or whether the effect is different at different levels of ethnic representativeness. Those would be interesting follow-up questions.

Also, the negative implications of this research need to be juxtaposed with the problem of groupthink. Groupthink occurs when there is too much consensus, leading to decision-making that lacks critical evaluation. This is more likely when the group of decision-makers is less diverse. So, perhaps the quantity of public goods spending is higher when there is less diversity in local government, but perhaps the quality of that spending is lower?