Saturday, 31 July 2021

Exposure to foreign-born students and the academic performance of U.S.-born students

Following on from yesterday's post about international programmes in the Netherlands (which was really about exposure to foreign students rather than international programmes per se), earlier this week the Economics Discussion Group at Waikato discussed this NBER Working Paper by David Figlio (Northwestern University) and co-authors. Figlio et al. looked at the effect of foreign-born high school students on the academic performance of U.S.-born students. As with the paper by Wang et al. I wrote about yesterday, there is a selection problem, and you can't simply compare U.S.-born students in schools with more foreign-born students to U.S.-born students in schools with fewer foreign-born students. As Figlio et al. explain:

First, immigrant students are not randomly assigned to schools, and are more likely to enroll in schools educating students from disadvantaged backgrounds... Second, US-born students, especially those from comparatively affluent families, may decide to leave when a large share of immigrant students move into their school district. Indeed, evidence shows that in the US, following an influx of disadvantaged students and immigrants, affluent, especially White, students move to private schools or districts with higher socio-economic status (SES) families, a phenomenon which has been labeled “white flight”... Both of these factors imply that immigrant exposure is negatively correlated with the SES of US-born students. Therefore, research that does not address the non-random selection of US-born students is likely to estimate a correlation between immigrant exposure and US-born student outcomes that is more negative than the true relationship.

Figlio et al. have access to very detailed data from the Florida Department of Education, which they link to birth records. This allows them to compare siblings, which is a smart way of dealing with the selection problem. Since both siblings tend to go to the same school, comparing siblings deals with the first selection issue above, because school-specific effects will be the same for both siblings. And because parents would likely move both siblings to a new school if they moved one of them, comparing siblings deals with the second issue as well. Importantly, because each sibling is in a different grade, and the number of foreign-born students changes over time, there is variation between the siblings in their exposure to foreign-born students, and it is that variation that Figlio et al. test for its association with academic performance.
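The sibling-comparison logic amounts to a family fixed effects regression, which can be illustrated with a small simulation (entirely made-up data and effect sizes, not Figlio et al.'s): a naive regression is biased by unobserved family characteristics that drive both exposure and achievement, while demeaning within families removes them.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n_fam = 2000
fam = np.repeat(np.arange(n_fam), 2)             # two siblings per family
fam_ses = np.repeat(rng.normal(0, 1, n_fam), 2)  # unobserved family SES

# Selection: lower-SES families end up in schools with more immigrant exposure
exposure = 7 - 2 * fam_ses + rng.normal(0, 2, 2 * n_fam)  # % foreign-born peers
score = 0.03 * exposure + fam_ses + rng.normal(0, 0.1, 2 * n_fam)

df = pd.DataFrame({"fam": fam, "exposure": exposure, "score": score})

def slope(x, y):
    x, y = x - x.mean(), y - y.mean()
    return (x * y).sum() / (x * x).sum()

naive = slope(df["exposure"], df["score"])  # wrong sign, due to selection bias
# Family fixed effects: demean within families, so only between-sibling
# variation in exposure identifies the effect
within = df[["exposure", "score"]] - df.groupby("fam")[["exposure", "score"]].transform("mean")
family_fe = slope(within["exposure"], within["score"])  # close to the true 0.03
```

Here the naive estimate comes out negative even though the true effect is positive, which is exactly the direction of bias Figlio et al. describe in the quote above.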

Their dataset contains information on all K-12 students in Florida from 2002-03 to 2011-12. The most restrictive analysis (of siblings) has a sample size of more than 1.3 million (out of a total sample of over 6.3 million U.S.-born students who speak English at home). The key outcome variables are student performance on standardised reading and mathematics tests, which they standardise (so all results are measured in terms of standard deviations). Comparing foreign-born (which they label 'immigrant') students with U.S.-born students in terms of test results, they find that:

Immigrant students’ performance in math (-0.097) and reading (-0.206) is lower than the one of US-born students (0.044 and 0.052).

Not too surprising then. In their main analysis, they find that:

...once selection is accounted for with family fixed effects, the correlation between cumulative immigrant exposure and academic achievement of US-born students is positive and significant. Moving from the 10th to the 90th percentile in the distribution of cumulative exposure (1% and 13%, respectively) increases the score in mathematics and reading by 2.8% and 1.7% of a standard deviation, respectively. The effect is double in size for disadvantaged students (Black and FRPL [free-or-reduced-price-lunch] eligible students). For affluent students the effect is very small, suggesting that immigrant students do not negatively affect US-born students, even when immigrants’ academic achievement is lower than the US-born schoolmates.

As with the Wang et al. paper from yesterday's post though, these results do not definitively demonstrate causality. However, Figlio et al. do a bit more digging, and find:

...suggestive evidence that the effect on US-born students is larger when the immigrants systematically outperform US-born students. Overall, these results suggest that immigrant students do not affect negatively US-born students, even when the immigrants’ academic achievement is lower than the US-born students, and may have a positive impact on US-born students when immigrants outperform them.

These are important results, and clearly run counter to the narrative that underlies the phenomenon of 'white flight' - that immigrant students make domestic-born students worse off academically. Although more research is needed to identify whether these results are causal, and to further explore the mechanisms that drive them, it is clear that diversity is better than (or at least not as bad as) many people expect.

[HT: Marginal Revolution]

Friday, 30 July 2021

The wage premium from studying in international programmes in the Netherlands

The Bologna Process, which began in 1999, was an attempt to harmonise higher education systems across Europe by adopting some common standards. One of its consequences was a large increase in international programmes at postgraduate level. Many of those programmes are taught in English, which reduces language barriers somewhat (at least, to the extent that English is a common language). How do graduates of these international programmes fare after graduation? That is the research question that this recent discussion paper by Zhiling Wang (Erasmus University Rotterdam), Francesco Pastore (University of Campania Luigi Vanvitelli), Bas Karreman, and Frank van Oort (both Erasmus University Rotterdam) addresses.

They compare Dutch students who studied in international programmes (in the Netherlands) with Dutch students who studied in domestic programmes. A simple comparison of those groups would be problematic though, because of self-selection into programmes. If better than average students tend to choose international programmes, then this would bias upwards the estimated effect of those programmes.

Wang et al. try to get around this by using a matching estimator - specifically coarsened exact matching (which I hadn't heard of before). Essentially, they match students in the international programme with students that fit into the same category, where categories are based on a combination of gender, year of graduation, field of study and university, cultural diversity in the bachelor programme, neighbourhood-level education, and father's income. They then compare students in the international programme with their matched groups who studied in a domestic programme.
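Coarsened exact matching is simple enough to sketch (with made-up data and variable names, not Wang et al.'s actual covariates): continuous covariates are coarsened into bins, students are matched exactly on the binned values, and any stratum lacking either treated or control students is discarded.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 1000
# Made-up student data: 'treated' = enrolled in an international programme
df = pd.DataFrame({
    "treated": rng.integers(0, 2, n),
    "gender": rng.integers(0, 2, n),
    "father_income": rng.normal(50, 15, n),  # continuous, so it gets coarsened
    "field": rng.integers(0, 4, n),
})
# Step 1: coarsen continuous covariates into a small number of bins
df["income_bin"] = pd.cut(df["father_income"], bins=4, labels=False)
# Step 2: match exactly on the coarsened covariates
strata = ["gender", "field", "income_bin"]
counts = df.groupby(strata)["treated"].agg(["sum", "count"])
# Step 3: keep only strata containing both treated and control students
ok = counts[(counts["sum"] > 0) & (counts["sum"] < counts["count"])].index
matched = df.set_index(strata).loc[ok].reset_index()
print(len(matched), "of", n, "students retained in the matched sample")
```

The discarded strata are why the matched sample is so much smaller than the full sample: any student without a comparable counterpart in the other group drops out.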

Wang et al. use data on the graduates of Master's programmes at thirteen Dutch universities over the period from 2006 to 2014, with education data linked to detailed administrative data by Statistics Netherlands. The full sample has over 29,000 students, while the matched sample reduces this to around 8000. Comparing the labour market outcomes of the matched samples, they find that:

...students from international programmes obtain a wage premium of 2.3% starting from the 1st year after graduation, ceteris paribus. The wage premium keeps increasing by about 1% every year.

That's quite a large effect. So, what explains it? Wang et al. dig a little bit into the mechanisms underlying the difference, and find that:

...the wage premium is largely driven by differential choices in the first firm upon graduation, rather than cross-firm mobility or faster upward mobility within firm. Upon graduation, Graduates from international programmes are much more likely to choose large firms that have a higher share of international employees and have business of trade for their first jobs. [They] get a head start in wage level and the initial wage advantages persist in the long-run.

Joining an international programme appears to be an excellent investment. However, there are a couple of limitations with this study. First, the matching estimator does ensure that the comparisons are between students that are very similar in terms of their observable characteristics. However, that doesn't ensure that they are similar in terms of unobserved characteristics. Clearly, the two groups are different in some way that induces some students to choose an international programme, and others to choose a domestic programme. That choice is not random. That means that these results do not demonstrate causality - we can't say for certain that an international programme causes the wage premium.

Second, the definition of international programmes was a little odd to me. Wang et al. use a data driven approach. As they explain:

...we compute proxies for programme-level internationalization based on detailed information of students’ ethnic composition. We classify all students into different cells by year of graduation and study programme. For each cell, we make a count of “most likely English-taught foreign students” that satisfy three criteria: First, they are first-generation immigrants... Second, they have never lived in the Netherlands before they start master programmes... Third, they are originally from non-Dutch-colony areas, non-Dutch-speaking countries or non-German-speaking countries... Our preferred measure is that when the number of these students exceeds 4 or the share of these students exceeds 25% for the first time, the study programme is regarded as an international one from that year onwards.
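The quoted classification rule is easy to state in code. Here is a minimal sketch (the yearly counts and shares are hypothetical):

```python
def is_international_by_year(counts, shares, min_count=4, min_share=0.25):
    """Wang et al.'s preferred measure, as described in the quote: flag a
    programme as international from the first year in which the number of
    'most likely English-taught foreign students' exceeds min_count or their
    share exceeds min_share, and for every year afterwards."""
    flagged, result = False, {}
    for year in sorted(counts):
        if counts[year] > min_count or shares[year] > min_share:
            flagged = True
        result[year] = flagged
    return result

# Hypothetical programme: crosses the count threshold in 2008, stays flagged
print(is_international_by_year(
    {2006: 2, 2007: 3, 2008: 6, 2009: 1},
    {2006: 0.10, 2007: 0.12, 2008: 0.20, 2009: 0.05},
))  # {2006: False, 2007: False, 2008: True, 2009: True}
```

Note the 'from that year onwards' feature: once flagged, a programme stays flagged even if the count falls back below the thresholds in later years.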

Their proxy for an international programme is really proxying exposure to foreign-born or foreign-educated students. It isn't really about international programmes at all. It surprised me that they didn't simply investigate which of the university programmes are described as international programmes, taught in English, etc., and use that to construct a measure instead.

Nevertheless, exposure to diversity is important (more on that in my next post), and this research is at least suggestive that such exposure may lead to better jobs in larger and more internationally-linked (and, importantly, higher paying) firms.

[HT: Jacques Poot]

Thursday, 29 July 2021

The Commerce Commission's supermarket report and consumer decision making under complexity

The Commerce Commission released its long-awaited draft market study into the grocery sector today. As the New Zealand Herald reported:

Supermarkets could be forced to sell their wholesale businesses or some sites to boost competition, after the Commerce Commission warned competition in the $22 billion grocery sector "is not working well for consumers".

While the Commerce Commission's draft market study into supermarkets did not say how much cheaper groceries should be, it said it would expect consumers to pay less if competition was better.

"If competition was more effective, retailers would face stronger pressures to deliver the right prices, quality and range to satisfy a diverse range of consumer preferences," Commerce Commission chair Anna Rawlings said...

"The major retailers appear to avoid competing strongly with each other, particularly on price. Meanwhile, competitors wanting to enter the market or expand face significant challenges, including a lack of competitively priced wholesale supply and a lack of suitable sites for large scale stores," Rawlings said.

The report, which you can read here (all 517 pages of it, but don't worry, there is a shorter executive summary as well), goes into the upstream (supermarkets' dealings with their suppliers) side of the market as well as the downstream (how consumers are affected). On the latter point, the New Zealand Herald article reports:

Rawlings said the pricing and loyalty programmes were so complex as to be confusing, making it difficult to make informed decisions. Consumers often did not appear to know how much personal data they were giving to the supermarkets when they signed up, or how it was used.

The rewards also require high levels of spend for a relatively low return. In one prominent loyalty scheme, shoppers are required to spend as much as $2000 to receive a $15 voucher in return.

I made a modest contribution to this report, conducting some experimental research with my colleague Steven Tucker into consumer decision-making under complexity. Our report is also available on the Commerce Commission website (see here). Specifically, we looked into how consumers' purchasing decisions are affected by having multiple pricing schemes to choose from simultaneously, and whether consumer welfare (measured by consumer surplus) was affected by having more complexity (in the form of more pricing schemes and more complicated schemes to choose from). We also looked into whether displaying unit prices induced consumers to make more optimal decisions.

We ran a number of experimental sessions in the Waikato Experimental Economics Laboratory earlier this year, where students participated in experiments and could earn real money based on their decisions. Each experimental session consisted of several decision rounds, in each of four stages. In each decision round, the research participants chose how much (if any) of a fictitious good they wanted to buy, faced with one or more pricing schemes and with a known schedule of 'buy-back values' (essentially, we offered to buy back any units of the good that the research participants bought, and gave them a schedule of the amounts they would be paid for different quantities). The use of buy-back values means that we know what the underlying demand curve is, and can calculate the optimal quantity that the research participants should purchase in order to maximise their consumer surplus. The four stages of the experiment were: (1) a single pricing scheme (of which there were several); (2) multiple pricing schemes, but participants could choose to buy only from one scheme; (3) multiple pricing schemes, and participants could buy from one or more of the schemes (this was the most complex); and (4) the same as stage 2, but with unit prices displayed for each pricing scheme.
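The role of the buy-back values can be sketched with made-up numbers (these are not the values used in our experiment): given a schedule of marginal buy-back values and a pricing scheme, there is a well-defined quantity that maximises consumer surplus, which is the benchmark against which participants' choices are judged.

```python
# Marginal buy-back values for each successive unit (made-up numbers)
buyback = [10, 8, 6, 4, 2, 1]

def total_cost(q, unit_price, discount_threshold=None, discount_price=None):
    # Cost of q units under a flat price, or a simple quantity-discount scheme
    # where units beyond the threshold are charged at a lower price
    if discount_threshold is None or q <= discount_threshold:
        return q * unit_price
    return discount_threshold * unit_price + (q - discount_threshold) * discount_price

def optimal_quantity(buyback, **scheme):
    # Consumer surplus = total buy-back value received minus total cost paid
    surplus = [sum(buyback[:q]) - total_cost(q, **scheme) for q in range(len(buyback) + 1)]
    return max(range(len(surplus)), key=surplus.__getitem__)

# At a flat price of 5, buy while the marginal value exceeds the price: q* = 3
print(optimal_quantity(buyback, unit_price=5))  # 3
```

Adding a quantity discount (say, units beyond the third costing 1 instead of 5) shifts the optimum, which is precisely the kind of recalculation that becomes hard for real consumers when several such schemes are on offer at once.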

There is a lot more methodological detail and the full results in our report. The Commerce Commission has picked out the bits that they felt were most relevant, and included those in their own report. In short, we found that:

...multiple discounting schemes do induce suboptimal decision making on the part of consumers. They are less likely to choose the optimal consumption bundle when faced with multiple pricing schemes, and the average welfare loss (loss of consumer surplus) is higher than when faced with no discounting and a single simple pricing scheme... Finally, we find weak statistical evidence that displaying unit prices mitigates the effects of multiple discounting schemes on the optimality of consumer decision-making.

The takeaway message is that complexity makes it difficult for consumers to optimise. It is attractive to believe that, when faced with a variety of different prices, consumers can easily identify the optimal consumption bundle. However, the real world isn't like an idealised model of rational utility-maximising consumers. When faced with multiple pricing schemes, some as part of loyalty programmes and some not, and some involving quantity discounts and others not, consumers can easily stray from the optimal decision. And if you layer on top of that a multiplicity of brands, different styles of promotions, packaging differences, and more, it is easy to see that this complexity can make a consumer worse off. Rather than optimising, consumers may respond to this complexity by satisficing (as Nobel Prize winner Herbert Simon termed it) - choosing an option that is 'good enough'. Moreover, having a multiplicity of pricing and promotional schemes can create the illusion of consumers getting a good deal, when in practice it makes it more difficult for them to do so.

The Commerce Commission is recommending mandatory unit pricing and that the supermarkets simplify their promotional practices and make the terms and conditions of their loyalty schemes clearer and more transparent. It will be interesting to see how those recommendations are received.

Wednesday, 28 July 2021

Hershey Friedman makes a case against higher education

In a new article published in the American Journal of Economics and Sociology (ungated earlier version here), Hershey Friedman (City University of New York) penned a very strong critique of higher education. The article is titled "Is Higher Education Making Students Dumb and Dumber?". Friedman writes:

What is truly amazing is that a four-year college degree can actually teach students to be stupid. At the very least, numerous students will lack many of the needed critical skills after they graduate. The reason for this is that professors and teaching assistants themselves lack some important skills, which include: 1) an appreciation of uncertainty, 2) respect for other disciplines, and 3) an understanding of what true diversity is all about.

Ouch. In relation to the first point (uncertainty), Friedman writes:

We are observing arrogant people ranging from academics to doctors to politicians who are certain of their facts. Unfortunately, not all the information available to the public is reliable. Many theories are flawed and are proven false once tested. Researchers speak of evidence-based medicine, evidence-based management, evidence-based practice; unfortunately, that “evidence” is often unreliable. Professors should be teaching students about the dangers of certainty. Instead, they are guilty of the same crime...

One of the most important things we can teach students is not to fall into the certainty trap. It is good to be unsure. There is nothing wrong with having doubts. We must teach students to understand that very little is known with certainty and there is nothing wrong with having some humility. Educators must stress that what they are teaching today might be refuted in a few years. The bottom line is that education is not about indoctrination, it is about critical thinking and attempting to minimize cognitive biases. 

I'll put my hand up and say that I teach economic theory with a good deal more certainty than it warrants. However, that's tempered by a good deal less certainty about the policy prescriptions, and by having outlined from the beginning (especially in my ECONS102 class) that the assumptions of economic models matter, even if not all of the assumptions are laid out in complete detail. Friedman isn't just directing his comments at economics though, but at all disciplines. A bit more humility about our disciplinary foundations and approaches is no doubt a good thing. Which brings us to Friedman's second point:

The academic department structure encourages a silo mentality and discourages interdisciplinary work and collaboration. Indeed, faculty are often encouraged to publish in a narrow area in order to receive tenure. Publishing outside one’s area is often a good way to be denied tenure. Most colleges are probably better known for turf battles than for communication and collaboration across disciplines and even sometimes across subareas in the same discipline...

Students majoring in one discipline are taught that it will provide the answers to all questions. Unsurprisingly, this kind of thinking is not encouraged in the corporate world. Collaboration, team work, learning organizations, and knowledge sharing are the mantras at most companies.

This is a fair criticism. Students have called me out for being unfair to marketing as a discipline in my lectures. I know I'm definitely not the only academic passing comment about other disciplines, and economics certainly comes in for a lot of criticism from other social sciences. The point about interdisciplinarity is important though. We want our students to be able to synthesise across multiple disciplinary traditions, which was part of the underlying premise of David Epstein's book, Range (which I reviewed a couple of weeks ago). If we want students to appreciate an interdisciplinary approach, we need to ensure that they are open to other disciplines, and that is something we as academics should model. Diversity is important, which is Friedman's third point:

Academe has been at the forefront of fighting for various kinds of diversity in higher education as well as the workplace and corporate boardroom. There is evidence that companies with more diverse workforces perform better financially than those with less diverse workforces... The kinds of diversity that academe has stressed include gender diversity and ethnic diversity; LGBTQ diversity is also being promoted by the academic world. There is one kind of diversity, however, that is virtually ignored by higher education: diversity of opinion...

Any organization that wants to flourish and be innovative has to create a climate where adversarial collaboration is encouraged. Unfortunately, we often see the opposite approach used in many organizations: executives prefer to surround themselves with sycophants. In academe, entire departments have the same opinions and approaches.

Again related to Epstein's book, diversity within teams is important, and teams do best when the members of the team bring different viewpoints. Again, this is something we could do a better job of modelling in higher education.

I do think that Friedman is overstating his argument though. The assertion in the title, that higher education actually makes students dumber, isn't really supported by evidence. However, that doesn't mean that we can't do things better. And the three areas that Friedman highlights (appreciating uncertainty, respect for other disciplines, and encouraging diversity of opinion) are definitely areas that we can work on.

Monday, 26 July 2021

A novel pricing strategy for a stock-market-themed restaurant

In ECONS101, we include an entire topic on pricing strategy (although it also includes some elements of non-pricing strategy). This is something that sets ECONS101 apart as a business economics paper rather than an economics principles paper.

The purpose of pricing strategy is simple - the firm is trying to maximise profits, and by undertaking some more 'exotic' pricing strategies, they may be able to increase profits beyond the profit that is achievable by selling at a simple single price-per-unit. Examples of alternative pricing strategies include price discrimination, block pricing, or two-part pricing. The underlying principle is that the firm wants consumers who are willing to pay more for the good or service, to pay more for it - that is how they extract additional profits from their consumers.

This stock market themed restaurant has stretched pricing strategy beyond what we normally see. They change prices every fifteen minutes, "based on supply and demand". Drinks that are more popular on the night go up in price, and those that are less popular go down in price. There are even periodic "market crashes", where the price of all drinks falls. And I looked it up - it's a real restaurant in Michigan called The Beer Exchange, with locations in Kalamazoo and Detroit. I'm not so sure that their pricing is based on supply and demand, because their explanation focuses purely on the demand side. However, that point aside, could this unique pricing strategy be effective in increasing profits?

As a gimmicky restaurant, The Beer Exchange probably attracts many tourists. That's an important point, because tourists' preferences (and their willingness to pay for different drinks on the menu) will be less known to the restaurant than the preferences of regular customers. The restaurant wants to charge more for drinks that their customers are willing to pay more for, so the restaurant wants to know a bit about the consumers' preferences.

One way to work out consumers' preferences experimentally is to set the prices initially at a moderate level, and gradually adjust prices as you learn more about customers' preferences. What the restaurant should recognise from the experimental exercise is that, if customers are buying a lot of something, then the price is probably too low. And, if they're avoiding buying something, then the price is probably too high. However, this sort of experiment takes a lot of time and effort to set up, and isn't going to be quick enough to pick up the preferences of a group of consumers who are only at the restaurant on a single night.

That's because most restaurants can't dynamically adjust their prices in response to the consumers that are already on the premises. The menu and prices are reasonably fixed, and consumers would probably frown on prices that adjusted constantly (this is one reason why tickets to sports games are priced much the same over an entire season, regardless of differences in demand between games). However, The Beer Exchange has built the dynamic adjustment of pricing explicitly into their business model. Because the theme of the restaurant is a stock market, their customers expect these dynamically adjusting prices. There is going to be little backlash to raising prices over the course of a night. That means that The Beer Exchange can home in on the profit-maximising drink prices for the consumers that are there on a particular night. Very smart.
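A price-adjustment rule of this kind might look something like the following toy sketch (this is an illustration of the idea, not The Beer Exchange's actual algorithm, and all the drinks and numbers are made up):

```python
def adjust_prices(prices, sales, baseline, step=0.25, floor=3.0, cap=12.0):
    # Toy rule: drinks selling above their baseline rate this period go up in
    # price, drinks selling below it go down, bounded by a floor and a cap
    new_prices = {}
    for drink, price in prices.items():
        if sales[drink] > baseline[drink]:
            price += step
        elif sales[drink] < baseline[drink]:
            price -= step
        new_prices[drink] = min(max(price, floor), cap)
    return new_prices

prices = {"ipa": 7.00, "lager": 6.00}
sales = {"ipa": 12, "lager": 3}   # the IPA is selling well tonight
baseline = {"ipa": 8, "lager": 8}
print(adjust_prices(prices, sales, baseline))  # {'ipa': 7.25, 'lager': 5.75}
```

Run every fifteen minutes, a rule like this gradually shifts relative prices towards whatever the customers in the room are actually willing to pay.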

Finally, the market crashes are probably included simply as a mechanism to avoid prices becoming too high. I expect that whatever algorithm The Beer Exchange uses to adjust prices tends to push prices up more often than down. That is probably intentional, because the after-dinner crowd is likely to be willing to pay more for drinks than the dinner crowd is. So, the restaurant will want prices later in the evening to be higher than prices earlier in the evening. However, they don't want prices to go too high, because consumers would stop buying drinks and profits would fall.

Overall, this is a smart and novel use of pricing strategy to extract some additional profits from consumers.

[HT: Sarah Cameron]

Sunday, 25 July 2021

Video games and class attendance

I've been reading a lot of the research literature lately, on the impact of class attendance on student performance. That relates to writing up some of my own research, based on an experiment I conducted on the ECONS101 class in 2019 (and would have done again in 2020 and this year, if the pandemic hadn't intervened). Anyway, I'll blog about that experiment a bit more in a future post. In this post, I want to talk about this 2018 article I had in my (far too large) to-be-read pile, by Michael Ward (University of Texas at Arlington), published in the journal Information Economics and Policy (ungated version here).

Ward used data from the American Time Use Survey from 2005 to 2012, and looked at the impact of video game playing on the amount of time devoted to class attendance and homework completion, among high school and college students. Ordinarily, this would be a difficult question to get a causal estimate of, as Ward explains:

...a potential negative association between video game play and time devoted to learning may be due to selection of individuals with different preferences for learning as well as a causal result of crowding out. It is possible that marginally performing students are less attached to school and invest less in human capital. Marginally performing students also may have a preference for video game playing. Even without a difference in preferences, they may allocate some of the time freed up from reduced participation in educational activities toward video game playing. In both cases, we would expect a negative correlation between gaming and educational inputs.

In other words, we might observe a negative correlation between video game playing and class attendance because the types of students who play video games are also the types of students who don't attend class anyway, or because students who don't attend class have more time and could therefore use more of that time to play video games (this would be a case of reverse causation).

Ward gets around this problem using instrumental variables analysis. As he explains:

I construct an instrumental variable from video game popularity. When the currently available games are perceived to be higher quality, the utility from playing video games rises. This is a temporary increase because the attractiveness of video games tends to fall quickly with cumulative time played. This temporary increase in marginal utility can result in large swings in the sales of video games from week to week. Thus, week-to-week variation in video game sales will be a valid instrumental variable if it affects time spent playing video games but has no direct effect on time spent on educational activities.

Video game sales can act as an instrument for time spent playing video games because sales affect gaming time but have no direct impact on time spent studying (in class or on homework). Limiting his analysis to weekdays, and excluding the summer holidays and the period between Thanksgiving and the end of the year, Ward has a sample of 3016 observations of daily time use. Looking at the impact of video game sales for each day (combined with the day before) on study time, he finds that:

A one standard deviation in video game sales leads to an average reduction in class time of about 16 minutes which corresponds to nearly a 10%% [sic] reduction. Video game time is consistently estimated to decrease homework time but this result is smaller and not always statistically significant. The marginal effect of gaming on homework by males is larger than for females but the effect on class attendance is not different from females.

So, video game playing does reduce class attendance, and the reduction is approximately one-for-one (for every additional hour spent playing games, class attendance reduces by an hour). Students aren't trying to make up for it by additional studying outside of class either, so that suggests there is likely a negative impact on student performance (to the extent that class attendance improves student performance). Unfortunately, as much as teachers may wish otherwise, this is the sort of exogenous impact on attendance that it would be difficult for any teacher to combat. Making classes more interactive and encouraging attendance constantly runs up against student preferences for leisure activities.
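Ward's instrumental variables logic can be illustrated with a toy two-stage least squares simulation (made-up numbers, not Ward's data or estimates): an unobserved preference for leisure confounds the naive estimate, but projecting gaming time onto the instrument recovers the true effect.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3000
# Unobserved preference for leisure drives BOTH gaming and skipping class
leisure = rng.normal(0, 1, n)
# Instrument: week-to-week swings in video game sales (new-release quality)
sales = rng.normal(0, 1, n)
gaming = sales + leisure + rng.normal(0, 1, n)                 # hours played
class_time = -1.0 * gaming - 2.0 * leisure + rng.normal(0, 1, n)

def slope(x, y):
    x, y = x - x.mean(), y - y.mean()
    return (x * y).sum() / (x * x).sum()

ols = slope(gaming, class_time)      # confounded: more negative than the truth
# 2SLS: first stage predicts gaming from sales; second stage regresses class
# time on the predicted (instrument-driven) part of gaming only
gaming_hat = slope(sales, gaming) * (sales - sales.mean()) + gaming.mean()
iv = slope(gaming_hat, class_time)   # close to the true effect of -1
```

The instrument works because sales shift gaming time (relevance) but only affect class time through gaming (exclusion), which is exactly the argument Ward makes in the quoted passage.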

Finally, Ward's analysis combined all gaming (including console games, computer games, mobile games, and board games [!]). It would be interesting to see if mobile gaming (which is even more prevalent now than it was in the 2005 to 2012 period that Ward's data comes from) has different effects from other gaming. Since, by definition, mobile gaming can be performed anywhere, it might not affect class attendance by as much (although it might affect the extent to which students pay attention in class!). Unfortunately, the time use data doesn't disaggregate gaming further, so it will take a whole other study to answer that question.

Saturday, 24 July 2021

This ain't a (crime)scene, it's an arms race

Gun violence has been in the news a lot recently, with incidents in Hamilton and Auckland among others, and predictably that has reignited the debate over the arming of police in New Zealand. However, we should resist the temptation. To see why, let's apply a little bit of game theory.

The arms race game is a variation of the prisoners' dilemma. For simplicity, let's assume that there are two players in our game, the police and the criminals. Both players have two strategies, to arm themselves or to disarm (or not arm). Both police and criminals are best off if they are armed and the other group is not, but if both are armed this escalates the potential for violent confrontation. Both players are making their decisions at the same time, making this a simultaneous game. The game is outlined in the payoff table below (payoffs are listed as Police first, then criminals):

                        Criminals
                     Arm          Disarm
Police   Arm      (-5, -5)      (10, -10)
         Disarm   (-10, 10)      (5, 5)

To find the Nash equilibrium in this game, we use the 'best response method'. To do this, we track: for each player, for each strategy, what is the best response of the other player. Where both players are selecting a best response, they are doing the best they can, given the choice of the other player (this is the definition of Nash equilibrium). In this game, the best responses are:

  1. If the criminals choose to arm, the Police's best response is to arm (since -5 is a better payoff than -10) [we track the best responses with ticks, and not-best-responses with crosses; Note: I'm also tracking which payoffs I am comparing with numbers corresponding to the numbers in this list];
  2. If the criminals choose to disarm, the Police's best response is to arm (since 10 is a better payoff than 5);
  3. If the Police choose to arm, the criminals' best response is to arm (since -5 is a better payoff than -10); and
  4. If the Police choose to disarm, the criminals' best response is to arm (since 10 is a better payoff than 5).

Note that the Police's best response is always to choose to arm. This is their dominant strategy. Likewise, the criminals' best response is always to choose to arm, which makes it their dominant strategy as well. The single Nash equilibrium occurs where both players are playing a best response (where there are two ticks), which is where both the Police and criminals choose to arm. This leads to an escalation of violence, and leaves everyone worse off than if both players had chosen to disarm. That demonstrates that the arms race is a type of prisoners' dilemma game (it's a dilemma because, when both players act in their own best interests, both are made worse off). If both police and criminals follow their dominant strategies, eventually we will end up in a situation like the U.S., with growing militarisation of police, which makes both police and citizens worse off.
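The best-response logic above is mechanical enough to check in a few lines of code. Here is a quick sketch, using the payoffs described in the post (-5 each if both arm, 10 and -10 in the mixed cases, and 5 each if both disarm):

```python
# Best-response analysis of the arms race game.
STRATEGIES = ["arm", "disarm"]

# payoffs[(police, criminals)] = (police payoff, criminals payoff)
payoffs = {
    ("arm", "arm"): (-5, -5),
    ("arm", "disarm"): (10, -10),
    ("disarm", "arm"): (-10, 10),
    ("disarm", "disarm"): (5, 5),
}

def best_response_police(criminal_choice):
    # The police strategy that maximises the police payoff, given the criminals' choice
    return max(STRATEGIES, key=lambda s: payoffs[(s, criminal_choice)][0])

def best_response_criminals(police_choice):
    # The criminal strategy that maximises the criminal payoff, given the Police's choice
    return max(STRATEGIES, key=lambda s: payoffs[(police_choice, s)][1])

# Nash equilibrium: each player's choice is a best response to the other's
nash = [(p, c) for p in STRATEGIES for c in STRATEGIES
        if p == best_response_police(c) and c == best_response_criminals(p)]
print(nash)  # [('arm', 'arm')]
```

As expected, arming is a best response to everything, so the only Nash equilibrium is mutual arming.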

How do we avoid this outcome? First, we need to recognise that the arms race game outlined above is not a one-shot game, it is a repeated game. That means that it is effectively played not once, but many times. In repeated games, the players are more likely to be able to work together to move away from the Nash equilibrium, and ensure a better outcome for all.

Let's assume for the sake of argument that the game is played weekly, and decisions are made simultaneously each week. Both police and criminals recognise that it is in their long-run interests to disarm. If they choose not to arm, and can trust the other player also not to arm, everyone benefits (or, rather, everyone faces a lower cost, since crime would continue, but it wouldn't be aggravated to the same extent by firearms).

Can each side really trust the other? That's the problem. In a repeated prisoners' dilemma game like this, each player can encourage the other to cooperate by using the tit-for-tat strategy. That strategy, identified by Robert Axelrod in the 1980s, works by initially cooperating (disarming), and then in each play of the game after the first, doing whatever the other player did last time. So, if the criminals armed last week, the Police arm this week. And if the criminals disarmed last week, the Police disarm this week. Essentially, that means that each player punishes the other for not cooperating, by themselves choosing not to cooperate in the next play of the game. It also means that each player rewards the other for cooperating, by themselves choosing to cooperate in the next play of the game. However, the tit-for-tat strategy doesn't eliminate the incentive for either side to cheat. Can the Police trust the criminals not to arm, if the criminals knew for sure that the Police would not arm?

A more severe form of punishment strategy is the grim strategy. This involves initially cooperating (like the tit-for-tat strategy), but when the other player chooses not to cooperate, you switch to not cooperating and never cooperate again. You can see that this strategy essentially locks in the worst outcome in the game. And unfortunately, arming the police is a type of grim strategy if it is difficult to back out of once the decision is made (which, judging by the experience in the U.S., may well be the case).
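A small sketch makes the difference between the two punishment strategies concrete (the opponent's sequence of moves here is entirely made up): tit-for-tat punishes once and then forgives, while the grim strategy arms forever after a single provocation.

```python
# Tit-for-tat vs grim, against an opponent who arms once (week 4) and
# disarms in every other week. Moves are "arm" or "disarm".
def tit_for_tat(my_history, their_history):
    # Disarm first; afterwards copy the opponent's previous move
    return their_history[-1] if their_history else "disarm"

def grim(my_history, their_history):
    # Disarm until the opponent arms once, then arm forever
    return "arm" if "arm" in their_history else "disarm"

def play(strategy, opponent_moves):
    # Run the strategy week by week against a fixed sequence of opponent moves
    my_moves = []
    for week, _ in enumerate(opponent_moves):
        my_moves.append(strategy(my_moves, opponent_moves[:week]))
    return my_moves

opponent = ["disarm", "disarm", "disarm", "arm", "disarm", "disarm"]
print(play(tit_for_tat, opponent))  # punishes in week 5 only, then forgives
print(play(grim, opponent))         # arms from week 5 onwards, forever
```

The grim strategy's permanence is exactly why routinely arming the police is so dangerous: one bad week locks in the bad equilibrium.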

We need another option. If the criminals are going to arm, and the Police can't trust them not to, and we want to avoid the grim strategy, we end up in the outcome where the Police (and society more generally) have the worst outcome, and criminals have the best. To encourage criminals not to arm, the Police need to be armed at least some of the time - at least enough of the time that the criminals' long-run best option is not to arm. If criminals got a payoff of 10 for sure each week, they would surely arm. But if they got a payoff of 10 some weeks and -5 in other weeks, then they might be better off on average by not arming.
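To put a rough number on that intuition, suppose (a simplification that goes beyond the payoff table) that the Police only turn up armed to confrontations with armed criminals, with probability p, so that a criminal who disarms always faces unarmed police and earns 5, while a criminal who arms earns 10 against unarmed police and -5 against armed police:

```python
# How often do the Police need to be armed to deter criminals from arming?
# Assumed simplification: a disarmed criminal always meets unarmed police
# (payoff 5); an armed criminal meets armed police with probability p.
def expected_payoff_arm(p):
    # Armed criminal: -5 against armed police, 10 against unarmed police
    return -5 * p + 10 * (1 - p)

PAYOFF_DISARM = 5  # disarmed criminal, unarmed police

# Find the smallest probability of meeting armed police that deters arming
for p in [i / 100 for i in range(101)]:
    if expected_payoff_arm(p) < PAYOFF_DISARM:
        print(f"Arming is deterred once p reaches {p:.2f}")  # 0.34, just over one-third
        break
```

Under these (assumed) numbers, the Police being armed a bit more than a third of the time in confrontations with armed criminals is enough to make disarming the better long-run option, without any need for routine arming.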

The challenge in this approach is that the Police need to find the right balance. If criminals are arming more often (as appears to be the case right now), then Police need to arm a little bit more as a deterrent. This doesn't mean that having all Police routinely armed is the right solution, only that a little movement in that direction might help to restore an uneasy but preferable outcome.

Let's not finish on such a gloomy note. Time for a musical interlude, courtesy of Fall Out Boy (since I ripped off their song title for the title of this post, it seems only fair):

Wednesday, 21 July 2021

Old boys' clubs and upward mobility at Harvard and beyond

Upward mobility (the movement of a person from a lower socio-economic class to a higher class, or from lower-income to higher-income, or from lower-wealth to higher-wealth) tends to be much lower than many of us think. Income mobility is also closely related to income inequality, as demonstrated by cross-country data in what is known as the 'Great Gatsby curve':

This Great Gatsby curve (taken from Wikipedia) shows a positive relationship between inequality (on the x-axis) and intergenerational immobility (on the y-axis). Immobility is, of course, the opposite of mobility, so this curve demonstrates a negative relationship between inequality and income mobility. Countries that have higher inequality have less income mobility. In other words, high inequality today tends to lock in high inequality in the future, and that's probably not a preferable outcome. At the least, it inhibits merit-based success from driving upward mobility.

The persistence of intergenerational immobility is related to inequality in social status. So, I was interested to read this new NBER Working Paper (ungated earlier version here) by Valerie Michelman (University of Chicago), Joseph Price (Brigham Young University), and Seth Zimmerman (Yale University). They looked into the effects of old boys' clubs at Harvard, how those clubs shape upward mobility, and whether interaction with high-status peers can increase the chance of upward mobility. Before I get to the results, here is how Michelman et al. briefly introduce Harvard's final clubs:

Social life at Harvard centered on exclusive organizations known as final clubs, so-called because they are the last clubs one joins as a Harvard student. These clubs, which Amory (1947) describes as the “be-alls and end-alls of Harvard social existence,” are hundreds of years old and count among their members multiple US Presidents.

Michelman et al. collect administrative data on Harvard entering classes from 1919 to 1935, including home addresses, where they dormed at Harvard, club membership, and academic rank. They match those records to 25th Reunion class reports, and where possible to data from the Census. A key feature of the dataset is that room assignments were effectively random, so students could not choose the 'neighbourhood' in which their room was located. The price of rooms varied, so the average price of each neighbourhood varied as well, and more expensive neighbourhoods included a higher proportion of high-status students (those who attended 'private feeder schools') than less expensive neighbourhoods did. Michelman et al. use this randomisation to compare students who dormed in these differently-priced neighbourhoods in terms of outcomes at Harvard and afterwards.

The paper has a huge amount of detail, but in short, they find that:

...exposure to high-status peers helps students achieve social success in college, but that overall effects are driven entirely by large gains for private feeder students. A 50 percentile shift in the room price distribution raises membership in selective final clubs by 3.2 percentage points in the full sample (34.2% of the mean). For private feeder students, the same shift raises membership by 8.4 percentage points (37.8%), while effects for other students are a precise zero. These effects build on similar patterns we observe starting in students’ first year at college. A 50-percentile increase in neighborhood price raises the count of first-year activities by 11.2% overall, with larger gains for private feeder students and small, statistically insignificant effects for others. Looking across activities, effects are largest for leadership roles, where baseline gaps in participation by high school type are also largest.

In other words, interacting with high status peers encourages more social activities, including final club membership, but only among students who are already high status. Score one for immobility. They also find that:

25 years after graduation, a 50 percentile change in peer neighborhood price raises the chance that students participate in adult social organizations by 8.7%. As with on-campus clubs, the overall long-run effects are driven entirely by large gains (26.1%) for private feeder students, with near-zero effects for others.

The social club effects observed while students are at Harvard persist into adulthood. They also affect incomes and social status:

Turning to occupations, a 50 percentile change in neighborhood price rank raises the share of private feeder students in finance by 7.2 percentage points, 39.7% of the group mean.

For high status students, interacting with other high status students is in turn associated with high-income careers in finance. Importantly, none of this is driven by academic performance. The high status students from private feeder schools consistently perform significantly worse than other students in terms of academic rank.

All of this analysis is based on data from Harvard students from the 1920s and 1930s. Surely things have improved since then? Michelman et al. collect data from more recent cohorts (although not in as much detail) up to the class of 1990, and find that:

Harvard changes profoundly over this time, enrolling women and many more non-white students. However, cross-group differences in academic performance persist. Public feeder students outperform private feeder students and Jewish students outperform Colonial students over the full 1924-1990 period. Differences in career outcomes change shape. Finance career choices by high school type and ethnicity converge, while gaps in academic careers and MBA receipt emerge, and gaps in medical careers remain large. Overall, students from high-income families at elite universities continue to earn more than other students at those universities.

So, maybe there are some cracks beginning to show in the upward immobility of students at Harvard. However, the persistent post-graduation income differences between high status students and other students remain. Clearly, social groups continue to matter greatly.

[HT: Marginal Revolution]

Tuesday, 20 July 2021

The minimum wage, the living wage, and the effective marginal tax rate

Advocates for the living wage tend to ignore that workers who currently receive the minimum wage also receive a lot of other government support, in the form of various rebates and subsidies, which they may not be eligible for if they earned a lot more. That means that increasing the minimum wage to the living wage would not necessarily lead to gains in net earnings that are as large as those advocates expect.

A worked example, based on U.S. data, is provided in this article by Craig Richardson. Here is the key figure:

Increasing the hourly wage from US$7.25 per hour to US$15 per hour would net a full-time worker only US$198.94, after accounting for all of the social benefits they would lose, and the additional taxes they would pay. Richardson writes:

There are some uncomfortable truths about raising the minimum wage from its current level of $7.25 per hour to $15 per hour that are revealed by an online tool created by our Center for the Study of Economic Mobility (CSEM) at Winston-Salem State University, along with our local research partner Forsyth Futures.

The tool, which we call the Social Benefits Calculator, enables anyone to go online and experience for themselves what it is like to be receiving social benefits and experience a monthly wage increase. Designed for Forsyth County, the calculator shows that with more than a 100% rise in the minimum wage, many people who currently receive social benefits will barely experience a change in their standard of living...

Let’s use the calculator and create a hypothetical example: a full-time working parent earning the minimum wage, who is unmarried with two children in subsidized day care. As seen in Table 1, after his or her wages more than double from $7.25 an hour to $15 an hour, earnings rise from $1,160 to $2,400, or a $1,240 change.

Sounds good, right? That’s an enormous bump up of wages by 106%. But after subtracting the decrease in benefits and higher taxes, that $1,240 increase erodes to just a $199 net improvement, or just a 16% change.

Imagine getting a big raise and seeing 84% of it go away. 

The effective marginal tax rate (EMTR) is the amount of the next dollar of income a taxpayer earns that would be lost to taxation, decreases in rebates or subsidies, and decreases in government transfers (such as benefits, allowances, pensions, etc.) or other entitlements. Taking the example of the table above, increasing the worker's wage from US$7.25 per hour to US$15 per hour increases their monthly before-tax-and-transfers income from $1160 to $2400. In other words, their income increases by $1240. With that higher income, they pay more federal and state taxes. Their monthly tax payments increase from $88.74 to $314.70. In other words, they pay an additional $225.96. The marginal tax rate over that interval is 18.2% (calculated as [225.96 / 1240]). But wait! Their entitlement to social benefits decreases from $3110.10 to $2295 monthly. In other words, they lose $815.10 in entitlements, on top of the additional taxes they pay. So, their effective marginal tax rate over the interval is 84.0% (calculated as [(225.96 + 815.10) / 1240]).
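The arithmetic in that paragraph is easy to check for yourself:

```python
# EMTR worked example, using the Forsyth County figures from the post
income_before, income_after = 1160.00, 2400.00
taxes_before, taxes_after = 88.74, 314.70
benefits_before, benefits_after = 3110.10, 2295.00

extra_income = income_after - income_before       # 1240.00
extra_taxes = taxes_after - taxes_before          # 225.96
lost_benefits = benefits_before - benefits_after  # 815.10

marginal_tax_rate = extra_taxes / extra_income                # taxes only
emtr = (extra_taxes + lost_benefits) / extra_income           # taxes + lost benefits
net_gain = extra_income - extra_taxes - lost_benefits         # what the worker keeps

print(f"MTR {marginal_tax_rate:.1%}, EMTR {emtr:.1%}, net gain ${net_gain:.2f}")
# MTR 18.2%, EMTR 84.0%, net gain $198.94
```

The gap between the 18.2% marginal tax rate and the 84.0% EMTR is entirely driven by benefit abatement, which is exactly why looking at tax schedules alone is so misleading.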

EMTRs are something that policy makers should keep a close eye on. When the EMTR gets too high, it can create some perverse outcomes. In some cases, workers could actually be financially better off by working less (this happens whenever the EMTR exceeds 100%). The problem is that every programme has its own eligibility rules and thresholds, and keeping track of how the interaction between all of those works is incredibly difficult. You can try out the Social Benefits Calculator tool mentioned in the article here (it is based on data for Forsyth County, North Carolina). We really need a similar calculator for New Zealand.

[HT: Marginal Revolution]

Monday, 19 July 2021

Who will really gain from the electric vehicle subsidy?

The government is pressing ahead with introducing subsidies on electric vehicles. However, that comes with unintended consequences, as Newsroom reported last month:

The prices of used electric vehicles have leapt in response to the Government announcing its subsidy of up to $3,450 a car, the country's biggest secondhand car importers say.

Japan-based vehicle buyer Marcus Jones has emailed car importers with a somewhat sardonic update on pricing.

"I thought perhaps you could pass on thanks from the wives and orphans of Japanese EV and Phev owners," he wrote, "who have seen the auction values of their cars rise in the past few days by more or less the precise amount that the New Zealand taxpayer has generously agreed to contribute."

That a subsidy causes prices to rise should come as no surprise to anyone with a good understanding of basic economics. This point is illustrated in the diagram below, which assumes that the subsidy is paid to the buyers of EVs. [*] Without the subsidy, the market is in equilibrium, with a price of P0, and Q0 electric vehicles are traded. Introducing the subsidy, paid to the buyer, is represented by a new curve D+subsidy, which sits above the demand curve. It acts like an increase in demand, and as a result the price that producers receive for an electric vehicle increases to PP. The buyers pay that price, then receive the subsidy back from the government, so in effect they pay the lower price PC. The difference in price between PP and PC is the per-vehicle amount of the subsidy (which is up to $3450 per car, as announced by the government). The number of electric vehicles traded increases to Q1 (because buyers want to buy more electric vehicles because of the lower effective price they have to pay, and sellers want to sell more electric vehicles because of the higher price they receive). One thing to notice is that the price doesn't go up by the whole amount of the subsidy - the difference between the original price P0 and the new higher price PP is less than the per-vehicle subsidy (PP - PC).

Who gains from the subsidy? It turns out that both buyers and sellers do. Without the subsidy, the consumer surplus (the difference between the amount that buyers are willing to pay, and what they actually pay) is the area FBP0. With the subsidy, the consumer surplus increases to the area FEPC. Consumers are better off with the subsidy. Without the subsidy, the producer surplus (the difference between the price that sellers receive, and their costs) is the area P0BH. With the subsidy, the producer surplus increases to the area PPGH. Sellers are better off with the subsidy.

However, not all groups gain from the subsidy. The government has to pay it, and that comes with an opportunity cost. Perhaps the government has less money to spend on schools, or roads, or raising the pay of striking nurses. Or perhaps they borrow, in which case future generations have to pay it back through higher taxes or decreased services. The area that represents the amount of subsidy paid by the government is PPGEPC (it is the rectangle that is the per-vehicle amount of the subsidy (PP - PC) multiplied by the number of subsidised vehicles Q1). [**]

Now, we can consider who gets the most benefit of the subsidy. In simple terms, on the diagram you can see that the price rise for sellers (from P0 to PP) is greater than the price fall for buyers (from P0 to PC). Sellers benefit more from the subsidy. In fact, the sellers' share of the subsidy is the area of the subsidy above the original price - the area PPGFP0. The buyers' share of the subsidy is the area of the subsidy below the original price - the area P0FEPC. The sellers' share is much larger than the buyers' share.

That need not necessarily be the case. Notice that the supply curve is quite steep, much steeper than the demand curve. The supply curve is relatively more inelastic than the demand curve. That means that sellers are less responsive to a change in price than buyers are. It turns out that whichever side of the market is more inelastic gets the larger share of welfare gains (or losses) when there are changes in market conditions. In the diagram above, the sellers are more price inelastic, and so they receive the greater share of the benefits of the subsidy. The reverse could be true. If buyers were more price inelastic, they would receive the greater share of the benefits of the subsidy. This is shown in the diagram below (which retains all of the same labels as the previous diagram, but shows the case where supply is more elastic than demand).
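You can verify the incidence result with a pair of made-up linear demand and supply curves (the slopes and intercepts below are purely illustrative, not estimates of the New Zealand EV market), where a steep supply curve corresponds to the first diagram:

```python
# Subsidy incidence with linear curves: Qd = a - b*Pc, Qs = c + d*Pp,
# where Pc is the effective price buyers pay and Pp = Pc + s is the
# price sellers receive under a per-unit subsidy s paid to buyers.
a, b = 100.0, 1.0   # demand intercept and slope (illustrative)
c, d = 20.0, 0.25   # supply intercept and slope - steep, i.e. inelastic
s = 10.0            # per-unit subsidy

p0 = (a - c) / (b + d)           # equilibrium price without the subsidy
pc = (a - c - d * s) / (b + d)   # effective buyer price with the subsidy
pp = pc + s                      # seller price with the subsidy

seller_share = (pp - p0) / s     # works out to b / (b + d)
buyer_share = (p0 - pc) / s      # works out to d / (b + d)
print(p0, pp, pc)                # 64.0 72.0 62.0
print(seller_share, buyer_share) # 0.8 0.2
```

With supply four times steeper than demand, sellers capture 80% of the subsidy and buyers only 20%; swap the slopes and the shares reverse, which is the second diagram's case.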

Coming back to the Newsroom article, if the price of electric vehicles is going up a lot, then that suggests that the supply is more inelastic than the demand. In fact, if the price actually went up by the entire amount of the subsidy, that would suggest that supply is perfectly inelastic, that is, completely unresponsive to price changes. Not everyone is suggesting that the price is going up by the full amount of the subsidy (from the same Newsroom article):

[Robert Young, director of New Zealand's biggest used car importer Nichibo Japan] estimated about half the $3450 subsidy would end up off-shore, benefiting the auction vendors in Japan and the UK as well as new car manufacturers. More would go to GST – meaning Kiwi EV buyers would pocket only about one-third of the subsidy.

It's hard to see why the supply of electric vehicles to New Zealand would be very inelastic compared with demand. There are other markets that Japanese second-hand car sellers could be selling to, including Australia, Thailand, Malaysia, the Indian subcontinent, and southern Africa (all areas that drive on the left). Receiving a higher price for selling into the New Zealand market should induce Japanese sellers to shift to selling their EVs to New Zealand instead of into those other markets. It is also hard to see why demand for EVs in New Zealand would be very elastic compared with supply, although the many substitutes (including petrol- or diesel-powered vehicles, which are due to be taxed and become more expensive concurrently with the introduction of the EV subsidy) and the high cost of EVs would push in that direction. It wouldn't surprise me to learn that the subsidy is roughly evenly shared between buyers and sellers.

Anyway, the key point of this post is that this is somewhat futile (again from the same Newsroom article):

But Transport Minister Michael Wood said the Government was keeping a close eye out for any attempts to take advantage of the subsidy.

“The new and imported used vehicle market is very competitive and I’m sure anyone attempting to distort market pricing will be called out," he said.

It's not the sellers that are distorting the market pricing, it's the subsidy.

[HT: Eric Crampton at Offsetting Behaviour]


[*]  The subsidy could be paid to the sellers instead of the buyers. However, it turns out that the price and welfare effects would be exactly the same, regardless of who it is paid to. The only difference would be in terms of the transaction costs (the costs of administration of the subsidy). There is an argument that it would cost less to pay the subsidy to EV sellers, because there are fewer of them, and so fewer payments would need to be made. However, a canny government would realise that not every buyer would claim back the EV rebate, and so paying the subsidy to buyers in the form of a rebate may turn out to be cheaper overall. And even if it doesn't, it looks better politically for the government to pay the subsidy to 'ordinary car buyers' than to 'millionaire car salespeople'.

[**] For completeness, adding the consumer surplus and producer surplus together, and subtracting the subsidy, gives us a measure of total welfare (or total surplus). Without the subsidy, total welfare is the area FBH. With the subsidy, total welfare decreases to the area (FBH - BGE). The area BGE is the deadweight loss of the subsidy. However, this assumes that there are no positive externalities associated with electric vehicles, which there probably are - a person buying an EV is a person not buying a carbon-powered vehicle, and so each EV sold reduces carbon emissions (and reducing a negative externality is the equivalent of a positive externality).

Sunday, 18 July 2021

Book review: Range

Should you aim to specialise in a single domain, developing deep expertise on a specific topic? Or aim for breadth, developing understanding of a wide range of domains? In David Epstein's view, we should be aiming for the latter, and that is the argument that he puts forward in his 2019 book, Range. The poster child for (and possibly against) the idea of specialising early and deeply is Tiger Woods, whom Epstein contrasts with Roger Federer in the first chapter. I hadn't realised the breadth of Federer's sporting experience as a youth, and that he came to tennis rather late. He seems to have turned out all right, so he is a good place to start.

The book is wide-ranging and somewhat difficult to abstract, but here is what Epstein writes in the conclusion:

The question I set out to explore was how to capture and cultivate the power of breadth, diverse experience, and interdisciplinary exploration, within systems that increasingly demand hyperspecialization, and would have you decide what you should be before first figuring out who you are.

So, a part of Epstein's argument in favour of breadth over depth is to develop a range of experiences. His own journey is a good example:

When I was seventeen and positive that I was going to go to the U.S. Air Force Academy to become a pilot and then an astronaut...

But I never did any of that. Instead, at the last minute I changed my mind and went elsewhere to study political science. I took a single poli-sci class, and ended up majoring in Earth and environmental sciences and minoring in astronomy, certain I would become a scientist. I worked in labs during and after college and realized that I was not the type of person who wanted to spend my entire life learning one or two things new to the world, but rather the type who wanted constantly to learn things new to me and share them. I transitioned from science to journalism...

I have to say that I should be very much in favour of Epstein's argument, as my own path to economics was long and winding. I too wanted to become an Air Force pilot, but failed to get into the final intake of the RNZAF fighter pilot programme. I went to university, and started initially in physics and maths, then changed in my second year to chemistry and materials science, before dropping out. I have variously worked as a lab technician, a labourer, an accountant, and in hospitality. I finally went back to university to study accounting, but before the end of my first year back I had decided to switch to economics and strategic management, before later dropping strategic management in favour of a focus on economics. However, to say that I 'focused' would probably be overstating the case, and one look at my academic CV would reveal the range of research projects in health economics, population economics, and development economics, that I have been involved in. Unlike most economists, I haven't gone deep on any particular topic, and I guess that is also reflected in the wide range of topics that I post on in this blog.

However, in spite of being somewhat predisposed to Epstein's arguments, I found myself not entirely convinced. He presents a large collection of anecdotes, supported by lots of references to research, and yet I didn't find it compelling. In part, that's because there are really two separate (but related) arguments that Epstein is making. In the early part of the book, he argues for more breadth for individuals. That is, he argues that people should err towards becoming generalists rather than specialists, or at least that they should delay specialisation until they have had a chance to test out many different paths. Later in the book, Epstein switches to advocating for diversity of teams. That is, that teams should be made up of a variety of viewpoints and should develop norms of challenging group-think. The two parts of the book are clearly related, but ironically, by failing to specialise the book on one argument or the other (and preferably the first), I think Epstein doesn't advance either as much or as thoroughly as he could. In particular, in relation to teams there is a whole research literature on the ethnic, gender, and viewpoint diversity of teams that would have added significantly to the argument, but was not considered.

Despite that gripe, I did enjoy the book, and there are some important implications for education. In particular, if there are advantages to late specialisation, that suggests that making university students choose a major at the start of their degree is a mistake (which is something that I have argued), and may hamper the development of their future careers. More breadth within majors would also seem to be useful, and on that point I thought this part of the book, which describes some research undertaken by New Zealand-based political researcher James Flynn (famous for the Flynn effect) on students at top state universities in the U.S., was particularly interesting:

Each of twenty test questions gauged a form of conceptual thinking that can be put to widespread use in the modern world. For test items that required the kind of conceptual reasoning that can be gleaned with no formal training - detecting circular logic, for example - the students did well. But in terms of frameworks that can best put their conceptual reasoning skills to use, they were horrible. Biology and English majors did poorly on everything that was not directly related to their field. None of the majors, including psychology, understood social science methods... Econ majors did the best overall. Economics is a broad field by nature, and econ professors have been shown to apply the reasoning principles they've learned to problems outside their area.

I'll take being the best of a bad bunch as a moral victory for economics. However, it is clear that education needs to be re-considered. Another thing that comes in for some criticism is the difficulty that interdisciplinary research has in getting funding. Epstein observes that research increasingly involves asking for funding for research where the answer is already known before the research begins. That isn't a recipe for advancement, or the type of serendipitous discoveries that underlie a surprising number of Nobel Prizes.

Anyway, the book is a good read. And interestingly, just as I was finishing up reading it, my attention was drawn to this new article published in the journal Perspectives on Psychological Science, which concludes that:

...(a) adult world-class athletes engaged in more childhood/adolescent multisport practice, started their main sport later, accumulated less main-sport practice, and initially progressed more slowly than did national-class athletes; (b) higher performing youth athletes started playing their main sport earlier, engaged in more main-sport practice but less other-sports practice, and had faster initial progress than did lower performing youth athletes; and (c) youth-led play in any sport had negligible effects on both youth and adult performance.

So, there is definitely something to be said for breadth over depth, in science and in sport.

[HT: Marginal Revolution, for the article]

Saturday, 17 July 2021

Yet another contingent valuation debate

There is something about contingent valuation as a methodology that seems to generate seemingly endless debates in the research literature (see here and here, for example). The contingent valuation method is a type of non-market valuation - a way of valuing goods (and services) that are not (and cannot be) traded in markets. For example, it can be used to value environmental goods (e.g. how much is improved water quality in a stream worth?) or more general public goods (e.g. how much is another bridge across the Waikato River worth?). Essentially, contingent valuation involves presenting research participants with a series of hypothetical choices in order to determine how much they would be willing to pay for the good or service (it is what we refer to as a stated preference method). For instance, you might use a survey that asks whether people would be willing to pay an additional $5 in council property taxes to improve water quality in a particular stream (or $10, or $50, or $500), and then use that data to work out how much (on average) people are willing to pay for improved water quality.
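As a concrete (and entirely hypothetical) illustration of how those survey responses become a willingness-to-pay estimate, one simple approach is a Turnbull-style lower bound: each bid amount is multiplied by the share of respondents who 'drop out' between that bid and the next higher one. All the shares below are made up for illustration.

```python
# Hypothetical single-bid survey results: the share of respondents
# willing to pay at least each bid amount (monotonically decreasing).
bids =      [0,    5,    10,   50,   500]   # $0 anchor plus the survey bids
yes_share = [1.00, 0.80, 0.60, 0.25, 0.05]  # made-up shares saying "yes"

# Turnbull lower bound: each bid times the share dropping out at that bid
lower_bound_wtp = sum(
    bids[i] * (yes_share[i] - yes_share[i + 1])
    for i in range(len(bids) - 1)
)
# Respondents still saying yes at the top bid contribute at least that bid
lower_bound_wtp += bids[-1] * yes_share[-1]
print(round(lower_bound_wtp, 2))  # 39.5
```

This is a lower bound because each dropping-out respondent is conservatively assumed to be willing to pay only the last bid they accepted, not anything above it.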

One of the challenges in contingent valuation is whether the payment is expressed as a single one-off payment (e.g. pay $100 once and the water quality will improve) or as a series of regular (e.g. annual) payments (e.g. pay $25 per year and the water quality will improve). The latest debate I read recently addresses this important question.
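To see why the framing matters, consider the present-value arithmetic a respondent would implicitly have to do. A quick sketch (with an invented $25 annual payment and invented discount rates; none of these numbers come from the articles discussed):

```python
# Present value of annual payments vs a one-off payment (invented numbers).
def pv_annuity(payment, rate, years):
    """PV of `payment` per year for `years` years at discount rate `rate`."""
    return payment * (1 - (1 + rate) ** -years) / rate

def pv_perpetuity(payment, rate):
    """PV of `payment` per year forever."""
    return payment / rate

# A $25/year payment maps to very different one-off amounts depending on
# the discount rate and the time horizon:
for rate in (0.03, 0.10):
    print(rate,
          round(pv_annuity(25, rate, 10), 2),  # ten annual payments
          round(pv_perpetuity(25, rate), 2))   # perpetual annual payments
```

At a 3 per cent discount rate, $25 per year forever is worth about $833 up front, but at 10 per cent it is worth only $250, so the 'equivalent' one-off payment depends heavily on a discount rate that survey respondents are unlikely to be calculating with.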

First, consider this 2015 article by Kevin Egan (University of Toledo), Jay Corrigan (Kenyon College), and Daryl Dwyer (University of Toledo), published in the Journal of Environmental Economics and Management (sorry, I don't see an ungated version online). Egan et al. argue that annual payments should be preferred over one-off payments, for three reasons:

First, survey respondents are spared from performing complicated present value calculations. When respondents compare their known annual WTP... to a proposed annual payment, discount rates cancel out in their benefit-cost analysis. When asked to make a one-time payment for a long-lasting environmental improvement, respondents must know their personal discount rate and perform the relevant present value calculation, while having complete fungibility of their income across time and no binding budget or liquidity constraints... Second, we conduct a convergent validity test by comparing CV estimates from surveys with one-time and ongoing annual payments to annual consumer surplus estimates from a travel cost analysis. We demonstrate that CV estimates from surveys with ongoing annual payments better match annual travel cost consumer surplus estimates. Third, a behavioral argument based on mental accounting... suggests that survey respondents who mentally set aside a fixed annual dollar amount for charitable giving will feel more constrained by a large one-time payment compared to a relatively small annual payment.

In other words, annual payments work better because they are easier for respondents to interpret, they are more consistent with revealed preference results based on actual behaviour, and they are consistent with a mental accounting story that notes that a one-off payment has an outsize effect on people's reported willingness-to-pay (WTP), compared with smaller annual payments. They go on to support their results with data from a contingent valuation survey of 967 people, investigating their willingness to pay for water quality improvements at Maumee Bay State Park in Ohio.

Enter the second article, by John Whitehead (Appalachian State University), also published in the Journal of Environmental Economics and Management, but in 2018 (ungated earlier version here). Whitehead attacks the original article on the basis of the convergent validity tests. Specifically, looking at the proportion of survey respondents who were willing to pay at different price levels, he notes that as the price increases a smaller proportion should be willing to pay that amount (because demand curves are downward sloping), but that isn't always the case. Although Egan et al. employ a method that allows them to deal with this issue (by merging parts of the sample where the downward sloping demand curve assumption would be violated), Whitehead shows that an alternative method demonstrates that both one-time payments and annual payments have convergent validity, and he concludes that there is therefore no reason to prefer annual payments.
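The pooling step at the centre of this dispute is easy to illustrate. In this sketch (with hypothetical response counts, not the actual Maumee Bay data, and a function name of my own), adjacent bid levels are merged whenever the share saying 'yes' rises with the bid:

```python
# Pooling adjacent bid levels that violate downward-sloping demand
# (hypothetical counts; not the actual survey data from Egan et al.).
def pool_violations(bids, yes_counts, n_offered):
    """Merge adjacent bid levels until the 'yes' share is non-increasing."""
    groups = [[b, y, m] for b, y, m in zip(bids, yes_counts, n_offered)]
    i = 0
    while i < len(groups) - 1:
        share_here = groups[i][1] / groups[i][2]
        share_next = groups[i + 1][1] / groups[i + 1][2]
        if share_next > share_here:          # violation: share rose with bid
            groups[i][1] += groups[i + 1][1]  # pool the two bid levels
            groups[i][2] += groups[i + 1][2]
            del groups[i + 1]
            i = max(i - 1, 0)                # re-check against the level below
        else:
            i += 1
    return [(g[0], round(g[1] / g[2], 3)) for g in groups]

# The shares at bids $10 and $25 violate monotonicity (0.50 < 0.55),
# so those two levels get pooled into one:
print(pool_violations([5, 10, 25, 50], [70, 50, 55, 20], [100] * 4))
```

Note the step back after each merge: pooling two levels can create a new violation against the level below, so the procedure keeps re-checking until the shares are non-increasing. Whitehead's point is that needing this step at all signals a problem with the data or the design.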

In the third article, published in the same issue of the Journal of Environmental Economics and Management (ungated earlier version here), Egan et al. respond. They extend their earlier analysis and demonstrate convergent validity using different estimators and a variety of sensitivity analyses. However, in order to demonstrate this validity, they first merge data on two different types of annual payments: a perpetual payment (one that would go on forever) and an annual payment for a fixed period of ten years. 

Finally, Whitehead follows up with a further re-analysis in this article, published in Econ Journal Watch (in 2017, weirdly before the previous two articles appeared officially in print - I guess JEEM had had enough of the debate and declined to publish this rejoinder). He concludes that:

Whenever bids are pooled using dichotomous choice data, the researcher implicitly acknowledges that something went wrong with the execution of the study or with the contingent valuation method itself. It might be that (a) bid subsamples are too low to generate enough power to conduct the statistical test, (b) bids are poorly designed (too close together, too far apart, or there is inadequate coverage of the range of WTP), or (c) contingent valuation method respondents are highly inconsistent... Data cleaning and pooling bids should not be considered a valid research method when the research goal is conducting validity tests over payment schedules or any other issue in the contingent valuation method.

Whitehead's criticism is quite valid. Manipulation of data to enable an analysis may be fine when a lower-bound (or upper-bound) estimate is all that is required. When we are testing the validity of a particular method, the standard of proof is somewhat higher. However, along the way the debate lost sight of two things. First, Egan et al. merged two different annual payment types in their broader analysis. They had earlier demonstrated in the convergent validity tests that perpetual annual payments resulted in similar WTP estimates as the travel cost method. To me, they really showed that perpetual payments should be preferred over a fixed number of annual payments, and over one-off payments. When they merged the two annual payment types together, that nuance was lost. Second, convergent validity was only one of the three reasons that Egan et al. originally proposed for why annual payments should be preferred. The debate focused on only one of the reasons, but the other two remain valid.

Overall, my takeaway is that more follow-up research is definitely required on the convergent validity question, but in the meantime it is likely that we should prefer perpetual annual payments in contingent valuation surveys. Whitehead argues that we should use both a one-time payment and annual payments. Given the potential for survey respondent fatigue and confusion, I wouldn't favour that approach. Contingent valuation solves a real problem - valuing environmental and other goods that are not traded in markets. However, we need to recognise that, like any method, it has its limitations.


Friday, 16 July 2021

The ethnographic atlas isn't 'tabulated nonsense'

I've seen a number of papers over the years that have made use of data from George Murdock's Ethnographic Atlas (see here for a gated summary, or here for the data), often as an instrument for some other variable (as one example, see here). The Atlas summarises ethnographic data from over 1200 pre-industrial societies, including a variety of characteristics such as political organisation, social organisation, norms, and agricultural practices. A search on Google Scholar reveals that it has been cited over 6700 times, so it is widely used.

Given widespread use of the Ethnographic Atlas, I was very interested when I saw the title of this new article by Duman Bahrami-Rad, Anke Becker, and Joseph Henrich (all Harvard University), published in the journal Economics Letters (ungated earlier version here): "Tabulated nonsense? Testing the validity of the Ethnographic Atlas". It turns out that I didn't need to worry too much, and nor should researchers using data from the Ethnographic Atlas. Bahrami-Rad et al. compare data from the Ethnographic Atlas with comparable variables from more recent Demographic and Health Surveys for the same ethnic groups. They find:

...positive associations between the historical information reported by ethnographers and the contemporary information reported by a large number of individuals. Importantly, the associations between historical ethnicity-level measures and contemporary self-reported data do not only hold for dimensions that would have been easy to observe for an ethnographer, such as how much a society relies on agriculture, or whether marriages are polygynous. Rather, they also hold for dimensions that are more concealed, such as how long couples abstain after birth, or whether people prefer sons.

Clearly, no cause for concern. The title of the paper is clickbait (the quote "tabulated nonsense" is attributed to the British anthropologist Sir Edmund Leach), and clearly effective, since it got me to read the paper.

Thursday, 15 July 2021

The PBRF has served its purpose and it's time for a re-think

New Zealand's Performance Based Research Fund (PBRF) has been going through a review over the last year or so (see here, or read the review's discussion document here). For those of you not in the know, the PBRF allocates some proportion of government university funding to each university on the basis of research performance. It is essentially a ranking exercise, undertaken by evaluating individual researchers (which is quite different from the UK or Australia, where the unit of assessment is the department), and then aggregating up to disciplines and to each university as a whole. It provides legitimacy to the claim that Waikato is number 1 in economics.

With the review underway, this article by Bob Buckle, John Creedy, and Ashley Ball (all Victoria University of Wellington), published in the journal Australian Economic Review (ungated earlier version here), is particularly timely. They looked at how the three previous full PBRF rounds (in 2003, 2012, and 2018, and ignoring the partial assessment round that occurred in 2006) affected New Zealand universities, focusing in particular on the incentive effects, and noting that:

A university can improve its research quality in three ways, although strong constraints are placed on the changes that can be made. Changes in average quality depend on the exits and entries of individuals (to and from other universities in New Zealand, or international movements), and the extent to which remaining individuals can improve their measured quality. A university can also influence its average quality by changing its discipline composition in favour of higher-quality groups.

They use confidentialised data from the Tertiary Education Commission on every researcher included in the three PBRF rounds to date, and focus on changes between 2003 and 2012, and between 2012 and 2018. Overall, they find some positive impacts on research quality:

There was a rise in the proportion of As and Bs in both periods... the proportion of A‐quality researchers increased by a factor of 2.5 between 2003 and 2018: from 6.5 per cent in 2003 to 13.3 per cent in 2012 to 16.4 per cent in 2018. The proportion of Bs increased by a factor of 1.5: from 25.9 per cent in 2003 to 39.6 per cent in 2012 and to 41 per cent in 2018.

A-quality researchers are world class in their fields, so an increase in researchers in that category is clearly a good thing. However, it does matter how the universities got there, and on this point Buckle et al. find that:

The net impact of exits on the AQSs [Average Quality Scores] is positive for all universities and disciplines in both periods; the net impact of entrants is always negative; and the net impact of QTs [Quality Transformations] is always positive.

In other words, the average quality of academics who exited the industry was lower than the average of those who stayed, while the average quality of new entrants (often early career researchers) was lower than the average of those already in the universities, and there was an improvement in measured quality among those who stayed. Looking at whether changes in individual researcher quality within disciplines, or changes in the composition of disciplines, contributed more to the improvement in measured quality overall, Buckle et al. find that:

For all universities combined, for 2003–12, the decomposition method found that all the increase in the AQS came from researcher quality changes. During the second period, the contribution arising from quality improvement was by far the dominant influence: indeed, the overall AQS would have been 3 per cent higher in the absence of any change in the overall discipline composition.

In other words, more of the improvement in AQS came from improvements in individual researcher quality scores, and not from universities making opportunistic changes to their disciplinary structures. 
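The decomposition idea can be illustrated with a standard shift-share split. In this sketch the discipline shares and quality scores are invented (two disciplines, two periods, not the actual TEC data), but the mechanics are the same: the change in the overall AQS divides exactly into a within-discipline quality component and a discipline-composition component:

```python
# Shift-share decomposition of a change in the average quality score (AQS).
# Two disciplines, two periods; all numbers invented, not the actual data.
shares_0 = {"econ": 0.40, "history": 0.60}   # staff shares, period 0
shares_1 = {"econ": 0.50, "history": 0.50}   # staff shares, period 1
aqs_0 = {"econ": 4.0, "history": 3.0}        # discipline AQS, period 0
aqs_1 = {"econ": 4.6, "history": 3.4}        # discipline AQS, period 1

overall_0 = sum(shares_0[d] * aqs_0[d] for d in shares_0)
overall_1 = sum(shares_1[d] * aqs_1[d] for d in shares_1)

# Within component: quality changes, holding composition at period-0 shares.
within = sum(shares_0[d] * (aqs_1[d] - aqs_0[d]) for d in shares_0)
# Composition component: share changes, valued at period-1 quality.
composition = sum((shares_1[d] - shares_0[d]) * aqs_1[d] for d in shares_0)

# The two components sum exactly to the overall change in the AQS.
print(round(overall_1 - overall_0, 3), round(within, 3), round(composition, 3))
```

With these invented numbers the within-quality component (0.48) dwarfs the composition component (0.12), which is the qualitative pattern Buckle et al. report for the actual data.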

So, overall, PBRF has had some good effects, and avoided the worst potential incentive effects for universities (at least at the disciplinary level). However, there are limits to how far those changes can be pushed. As Buckle et al. note in their conclusion:

The substantial reduction in the rate of quality improvement during the period 2012–18, compared with the earlier period 2003–12, suggests some streamlining of the process may be warranted, particularly in view of the high compliance and administrative costs. Furthermore, the major contribution to the average quality improvement in all universities and disciplines has resulted from the large number of exits of lower‐quality researchers. This also suggests that the extensive process, and information required, used to distinguish among higher‐quality categories is no longer necessary.

The PBRF is immensely time-consuming and costly, both to individual researchers who need to spend a lot of time preparing and polishing individual portfolios, and to universities. Researcher quality has improved tremendously in New Zealand since the PBRF was introduced in the early 2000s. The low-hanging fruit has clearly been picked, and the non-performing researchers are mostly gone. It is time to re-consider whether the PBRF remains useful, especially in terms of an assessment of individual researcher quality. Buckle et al. stopped short of conducting a full cost-benefit analysis of the PBRF system, but it is time that someone followed through and completed that exercise.

Wednesday, 14 July 2021

Iceland and the four-day workweek

Iceland has been in the news over the last couple of weeks for the success of a trial of a four-day workweek (see here and here, for example). As the BBC reported:

Trials of a four-day week in Iceland were an "overwhelming success" and led to many workers moving to shorter hours, researchers have said.

The trials, in which workers were paid the same amount for shorter hours, took place between 2015 and 2019.

Productivity remained the same or improved in the majority of workplaces, researchers said.

I took these reports at face value, while noting that my concerns about the Perpetual Guardian trial in New Zealand, especially in relation to the Hawthorne effect, remain valid. However, it appears that there has been some misreporting of what was actually trialled in Iceland. Anthony Veal (University of Technology Sydney) wrote in The Conversation today:

It almost seems too good to be true: a major trial in Iceland shows that cutting the standard five-day week to four days for the same pay needn’t cost employers a cent (or, to be accurate, a krona).

Unfortunately it is too good to be true.

While even highly reputable media outlets such as the BBC have reported on the “overwhelming success” of large-scale trials of a four-day week in Iceland from 2015 to 2019, that’s not actually the case.

The truth is less spectacular — interesting and important enough in its own right, but not quite living up to the media spin, including that these trials have led to the widespread adoption of a four-day work week in Iceland...

The media reports are based on a report co-published by Iceland’s Alda (Association for Democracy and Sustainability) and Britain’s Autonomy think tank about two trials involving Reykjavík City Council and the Icelandic government. The trials covered 66 workplaces and about 2,500 workers.

They did not involve a four-day work week. This is indicated by the report’s title – Going Public: Iceland’s journey to a shorter working week...

Read on to the third paragraph and you’ll learn the study “involved two large-scale trials of shorter working hours — in which workers moved from a 40-hour to a 35- or 36-hour week, without reduced pay”.

A four-day week trial would have involved reducing the working week by seven to eight hours. Instead the maximum reduction in these trials was just four hours. In 61 of the 66 workplaces it was one to three hours.

Extrapolating from the effect of a reduction of 1-3 hours (for the majority of employers in the trial) to a reduction of 8 hours may be a bit of a stretch. Veal also notes the potential for the Hawthorne effect - workers know that they are involved in a trial, and that researchers (and their employers) are watching them closely. It is only natural that they would work a little harder, and this would manifest in higher productivity.

It may be too early for this (from the BBC article):

The trials led unions to renegotiate working patterns, and now 86% of Iceland's workforce have either moved to shorter hours for the same pay, or will gain the right to, the researchers said.

At least though, we can watch with interest how a larger scale shift to shorter working hours affects productivity in Iceland. However, to understand that question we would need to agree on what we mean by productivity.

In general, labour productivity is the economic output per unit of labour input. In this case, it really matters how you measure labour input. If you measure it in terms of hours of labour, reducing the workweek for all workers might increase productivity, while at the same time reducing the total output of the economy. That arises simply because of diminishing marginal returns to labour. Each worker is progressively less productive each hour that they work than they were the hour before (maybe tiredness or boredom are factors here). So, removing the least productive hours from the workweek will raise average productivity, but would still mean that less work gets done in total.

However, if you measure labour input in terms of the number of workers (or equivalent full-time workers, even adjusting for the change in definition of 'full-time'), then labour productivity will decrease. Unless you genuinely believe that there is negative marginal product from the hours that are being cut, in which case the employers are irrational and should have cut hours long ago, without any need for government intervention (why would an employer pay a worker for hours that decrease their total output?).
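A toy model makes the measurement point concrete. Assume (purely for illustration) that each successive hour in the week yields 97 per cent of the output of the hour before it, so marginal product diminishes but stays positive:

```python
# Toy model of diminishing marginal returns to weekly hours (all numbers
# invented): each extra hour yields 97% of the output of the hour before.
def weekly_output(hours, first_hour=10.0, decay=0.97):
    """One worker's weekly output from working `hours` hours."""
    return sum(first_hour * decay ** h for h in range(hours))

for hours in (40, 32):                 # five-day week vs four-day week
    output = weekly_output(hours)
    print(hours,
          round(output / hours, 2),    # labour productivity per hour worked
          round(output, 1))            # labour productivity per worker
```

Cutting the week from 40 to 32 hours raises output per hour worked but lowers output per worker (and total output), so whether 'productivity' rose depends entirely on which denominator you pick.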

There are other salient issues that I don't think have been adequately canvassed on this topic. For instance, in jobs where there are significant tournament effects, there might be no decrease in productivity measured per worker. That's because when there are tournament effects, people are paid a 'prize' for their relative performance (that is, for winning the 'tournament'). The prize may take the form of a bonus, a raise, or a promotion. The point is that each worker only needs to be a little bit better than the second-best worker in order to 'win' the tournament. Those incentives would work to undo the decrease in work hours, since if everyone else reduces their work hours from 40 to 32, a worker who keeps working 40 hours will increase their chances of winning the tournament. If you doubt that tournament effects are real, I recommend asking any serious academic how many hours they work each week (since tournament effects are rampant in academia). None of this is consistent with the overall goal of the four-day workweek, which is to reduce work. All it would do in these occupations is shift more of the work to outside of the paid workweek.

Also, the four-day workweek may be great for employees, but how will it affect self-employed workers? Or 'contractors', who are nominally self-employed but have little control over their work conditions? Or workers in the gig economy? Or interactive service workers (e.g. baristas), where the potential productivity gains (measured per hour worked) are likely to be close to zero? Will governments need to adjust the minimum wage (which is expressed in hourly terms, not weekly terms)? Does this reduce holiday entitlements (which are generally expressed as a number of weeks, but each week is now four days, not five)? That isn't to say that any of these issues is fatal for a four-day workweek proposal, only that they are things that any government will need to think about before such a proposal goes ahead.
