Thursday, 30 December 2021

A slightly more optimistic take on online learning during the pandemic

Regular readers of this blog will know that I am quite sceptical of the efficacy of online teaching and learning, and I have previously written about the evidence coming out of studies during the pandemic (see here, and for more see the list of links at the bottom of this post). My take on the evidence so far is that we need to carefully evaluate the changes that the pandemic has forced on teaching, in relation to the move to online. So, I was interested to read this recent article by George Orlov (Cornell University) and co-authors, published in the journal Economics Letters (ungated earlier version here).

Orlov et al. evaluate the impact of the pandemic-enforced shift to online teaching on learning in seven intermediate-level economics courses at "four R1 PhD-granting institutions" (translation: top US universities). Their study has the advantage of being based on low-stakes post-test evaluations that use the same questions, implemented in the Spring or Fall semester of 2019 (i.e. before the pandemic, when teaching was in-person), and again in the Spring semester of 2020 (i.e. when part of the semester was affected by the pandemic and learning had shifted online). In relation to the teaching, they note that:

Six of the seven classes were taught synchronously during the remote instruction period with lectures delivered using Zoom. The seventh instructor pre-recorded lectures and spent the scheduled class time in Zoom answering student questions about the material.

So, that pretty much reflects practice across all universities during this time. Because the pandemic affected only part of the Spring 2020 semester, Orlov et al. are able to isolate its effect on the topics that were taught remotely, as well as look at the effect overall (including topics that were taught in person). Based on their sample of 809 students (476 pre-pandemic, and 333 in the pandemic-affected semester), they find that:

...in the pandemic semester, the overall score drops by 0.185SD (p = 0.015) while the remote subscore drops by 0.096SD (p = 0.181). A possible explanation for the discrepancy is that these scores measure learning of topics taught closer to the administration of assessments, which potentially would be fresher in students’ memory. Furthermore, at the institutions in this study, there was an extended break (up to three weeks) before the remote portion of the semester started. Overall, these results suggest that student outcomes did suffer in the pandemic semester and the magnitudes of the declines in learning were not trivial.

So far, so not good for student learning. Looking at differences across subgroups (gender, race, first-in-family students, students without English as a first language), Orlov et al. find no statistically significant differences. Then, looking at teaching factors, they find:

...evidence that instructor experience and course pedagogy played important roles in ameliorating the potentially negative effects of the pandemic on learning. When the instructor had prior online teaching experience, student scores were significantly higher overall (0.611SD, p = 0.074) and for the remote material (0.625SD, p = 0.000). Students in classes with planned student peer interactions earned scores that were similar relative to students in other classes on the overall scores and 0.315SD higher (p = 0.040) for the material taught remotely.

That suggests that, as teachers gain more experience with online teaching, the negative impacts may reduce. However, even more importantly, it suggests a method of teaching that may help. Orlov et al. defined peer interactions as:

...the use of at least two of the following strategies: 1) classroom think-pair-share activities... 2) classroom small group activities, 3) encouraging students to work together outside class in pre-assigned small groups, and 4) allowing students to work together on exams.

These sorts of interactive learning approaches are slightly more difficult to execute in the synchronous online environment (and unlikely to be achievable in an asynchronous format), but it appears that they can be done well, and contribute to student learning (or, at least, reduce the negative impact of online learning). However, we shouldn't read too much into this study. Although Orlov et al. did some sub-group analysis, the study was based on a sample from top US universities, which probably doesn't extrapolate well to lower-ranked institutions, where students may be less motivated or engaged. And, the teaching methods were not randomly assigned, so we can't give the differences a causal interpretation. Clearly, we still need more work in this area.

Wednesday, 29 December 2021

The impact of study abroad on student language skills and personality traits

When we revamped the Bachelor of Management Studies with Honours in 2018, one of the things we introduced was a greater focus on international engagement. As I noted in this post in 2019, the international study tour was an integral (and unique in the New Zealand context) component of the new degree structure. It seems obvious that modern management students (and other students) need more than a passing appreciation of the global context (an idea taken to an extreme in the original proposal for Minerva University), but is there evidence to support the benefits of such an international approach?

This 2018 article by Silvia De Poli, Loris Vergolini, and Nadir Zanini (all Fondazione Bruno Kessler), published in the journal Applied Economics Letters (sorry, I don't see an ungated version online), provides some suggestive evidence in favour of study abroad. De Poli et al. make use of data from the MOS-4 programme implemented in the Italian province of Trento in 2012, which randomly selected applicants who had just completed their fourth year of high school to complete a four-week intensive English language course in the UK or Ireland. Interestingly, De Poli et al. looked at how the programme affected academic performance, as well as personality traits of the students. Comparing students who were randomly selected with those who weren't, they find that:

...the programme, in the short term, had a positive and significant effect on five of the eight personality traits considered: self-confidence, adaptability, social orientation, willingness to communicate and openness... On the other side, the programme has no effect on other dimensions related to ‘global self-esteem’, defined as the objective assessment of their skills and the acceptance of their quality...

Regarding the improvement of English language skills, we found that participants achieved a better score in the post-programme test as well as enjoying higher school achievement one semester after returning to school... However, with respect to other subjects (e.g. humanities and sciences), participants’ achievement, measured one semester after completion of the programme, did not improve.

However, not all students appeared to have benefited equally. As I have noted in relation to online learning (most recently here), the more engaged and high ability students benefited the most. Specifically, De Poli et al. found that:

...high attainers appeared to have benefitted more from the programme, as they achieved better results in English language in their first semester at school after completion of the programme...

So, as I said earlier, we can take this as suggestive evidence in favour of study abroad. This study had the advantage of randomisation, which is rarely a feature of study abroad programmes, where self-selected students go to (often) self-selected countries to study abroad. The resulting selection bias reduces the chance that we can genuinely determine the effect of study abroad. What we really need now is a randomly assigned study abroad programme at the university level, and one that isn't simply focused on the acquisition of host country language skills.

Tuesday, 28 December 2021

The changing drinking norms among young people

I've done a fair amount of research on alcohol consumption and behaviour in the night-time economy (see here, and here, and here, with more research accepted for publication that I will blog about later). That research was based on fieldwork conducted in 2014 and in 2019 in the Hamilton CBD. The one thing we most noticed about how nightlife had changed over that five-year period was how much quieter things were in 2019 than in 2014. The people who were drinking were just as intoxicated in 2019 as in 2014, but there were fewer of them. The decline was especially noticeable on Thursday and Saturday nights.

Such a rapid change in the nightlife came as a real surprise to us. But it may simply reflect a generational change in drinking behaviour, as outlined in this 2020 article by Rakhi Vashishtha (La Trobe University) and co-authors, and in this article in The Conversation this week, by Sarah MacLean (also La Trobe University) and co-authors. MacLean et al. first note that:

Young people in Australia, the UK, Nordic countries and North America have, on average, been drinking significantly less alcohol than their parents’ generation did when they were a similar age.

They then outline a number of possible explanations for this generational shift, including:

...uncertainty and worry about the future, concern about health, changes to technology and leisure, and shifting relationships with parents.

In relation to uncertainty, MacLean et al. note that:

A couple of decades ago, getting really drunk was widely regarded by many young people as a “rite of passage” into adulthood and a good way of taking time out from the routines of work and study.

Now, young people feel pressure to present as responsible and independent at an earlier age and some fear drinking to intoxication, and the loss of control it entails, will jeopardise their plans for the future.

That said, people who do drink may still do stupid things, especially if they suffer from FoMO.

On health concerns, MacLean et al. write:

Health and well-being also seem to be increasingly important to young people.

Research from 15-20 years ago found young people viewed the consequences of heavy drinking (vomiting, unconsciousness) positively, or at least ambivalently.

More recent studies suggest this has changed, with young people expressing concerns about risks to mental health and long-term physical health related to their alcohol use.

However, Australian and Swedish research also found some young people regard the social benefits of drinking as important to their well-being.

For many young people, however, this seems to involve moderate alcohol consumption, in place of the “determined drunkenness” observed in the 1990s and early 2000s.

In relation to changes in technology, MacLean et al. note that:

Technology has reshaped how young people socialise, with contradictory effects on youth drinking.

Social media provides new (less regulated) avenues for alcohol companies to promote their products. Holding a drink is de rigueur for a photo on social media celebrating a night out.

Yet, young people are also careful to manage their online images...

Our research found young people worry about who might see images of them drunk on social media (such as friends, family and future employers), a risk that is unique to this generation.

Finally, on relationships with parents:

Young people also spend more time with their parents, potentially developing more communicative relationships that reduce their need to drink and rebel.

It is striking that there has been such a change in drinking behaviour among young people over a very short time. MacLean et al. have raised some plausible explanations for why, which may also apply to New Zealand (see this post for more on the shift in behaviour in New Zealand). It would be interesting to see what the relative contributions of these different explanations are to the change in behaviour, and indeed whether there are other factors at play as well.

Thursday, 23 December 2021

Book review: Applied Economics (Thomas Sowell)

Last month, I reviewed Thomas Sowell's 2011 second edition of Economic Facts and Fallacies. This month, I read the 2009 revised edition of his book Applied Economics. Like the other book, this one is written in a clear and easy-to-read style, and supported by empirical evidence (not Sowell's own evidence, but evidence gathered from other research). However, there is a fair amount of overlap between the two books in terms of topic coverage, with discrimination and development in particular seeming to cover the same ground in both books. The 'hook' in this book is different though, and captured by the subtitle "Thinking beyond stage one". When I read that, I was thinking it meant beyond stage one economics - that is, intermediate economics in a standard progression through an economics major. Instead, Sowell refers to the tendency for politicians, policy makers and others to only think of the short-run first-order effects of a policy change, and not to work through subsequent effects that will occur in the medium-run or long-run. It is over these longer timescales that most unintended consequences become apparent, but for a politician in particular, negative impacts that occur after they are no longer in office will be of little consequence to them right now. As one example:

...a small-time criminal may find it expedient to kill some local store owner for the small amount of money in the store's cash register, if only to keep the store owner from identifying him, even though this might make no sense in organized crime.

Public outrage at such a murder could result in more law enforcement activity in the area, reducing the profitability of the crime syndicate's business in illegal drugs, prostitution, and other activities by making local customers more hesitant to engage in such activities when there was an unusually large police presence in their neighborhoods.

In this example, the small-time criminal isn't thinking beyond stage one, unlike crime syndicates that have strong incentives to do so. There are many similar examples that outline the consequences of failing to think beyond stage one. I feel like Sowell does a much better job of keeping on point in this book. However, like the other book, this one is a little uneven in places. For example, at one point he carefully explains why low American life expectancy cannot be taken as evidence of a poor health care system in the US, and then a few pages later uses declining life expectancy in the former Soviet Union as evidence of poor health care. There is also much that some readers will disagree with, in particular Sowell's views on the integration of immigrants.

Despite that, there is also a lot of really good material in the book. I especially enjoyed the section on prejudice, bias, and discrimination, and in particular Sowell's explanations of the costs of discrimination. Many researchers and policy makers would do well to read that section. There is also more humour in this book. For example:

Where the issue is the safety of nuclear power plants, for example, the answer to the question whether nuclear power is safe is obviously No! If nuclear power were safe, it would be the only safe thing on the face of the earth. This page that you are reading isn't safe. It can catch fire, which can spread and burn down your home, with you in it. The only meaningful question, to those who are spending their own money to deal with their own risks, is whether it is worth what it would cost to fireproof every page in every book, magazine, or newspaper, not to mention paper towels, stationery, notebooks and Kleenex.

Of the two Sowell books that I have read over the last month or so, this one is clearly the better of the two. Given the overlap in content, and the fact that this one seems to keep on point a bit more, I would recommend Applied Economics, rather than Economic Facts and Fallacies, to readers interested in getting a clear-eyed (but not necessarily entirely agreeable) perspective on a range of issues from an economics perspective.

Wednesday, 22 December 2021

Uber and alcohol-related traffic fatalities

Earlier this year, I wrote a post about the relationship between UberX and alcohol consumption, based on this recent article. The key takeaway was that the presence of UberX was associated with an increase in alcohol consumption. As I wrote then:

Decreasing the full cost of drinking at bars, by making it less expensive to travel to and from bars, increases alcohol consumption. That is likely to be associated with increased harm, including harm to health from over-intoxication, or violence. However, before we conclude that UberX has an overall negative effect, we need to consider that the availability of UberX might also offset some harm by reducing the incidence of drunk driving. Teltser et al. didn't look at the effects of UberX on harm, or on drink driving, which is something we would need more information on before drawing a firm conclusion.

Does Uber increase, or decrease, alcohol-related traffic accidents? That is essentially the question addressed by this recent NBER Working Paper by Michael Anderson and Lucas Davis (both University of California, Berkeley). They match normalised rideshare data [*] obtained directly from Uber with fatal traffic accident data from the Fatality Analysis Reporting System (FARS), from July 2012 to January 2017.

Interestingly, Anderson and Davis first show that the 'event study' extensive margin analysis that has been used by many other studies (which essentially use a difference-in-differences approach to compare accidents before and after the introduction of Uber, like the study linked above) leads to highly inconsistent and non-robust results. Most studies have used that approach because they lack access to actual rideshare data. Instead, Anderson and Davis look at the relationship between the (normalised) number of Uber trips and traffic fatalities, and find that:

A one unit increase in ridesharing activity reduces the probability of an alcohol-related fatal crash by 0.038 percentage points (t = -3.2). This corresponds to approximately a 4.8% decrease in alcohol-related fatalities.
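The two numbers in that quote are linked by simple arithmetic: a change in percentage points, divided by the baseline probability, gives the relative change. As a rough sketch (the function name is my own, and the baseline is only what the two rounded figures imply, not a number reported in this post), we can back out the baseline probability of an alcohol-related fatal crash:

```python
def implied_baseline(pp_change, relative_change):
    """Baseline (in percentage points) implied by an absolute change
    (in percentage points) and the relative change it corresponds to."""
    return pp_change / relative_change

# Figures quoted from Anderson and Davis: a 0.038 percentage point
# reduction that corresponds to roughly a 4.8% decrease.
baseline_pp = implied_baseline(0.038, 0.048)

# The implied baseline is an inference from rounded numbers, so treat it
# as approximate.
print(f"Implied baseline probability: {baseline_pp:.2f} percentage points")
```

In other words, the quoted effect is small in absolute terms because alcohol-related fatal crashes are rare events in the underlying data to begin with.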

Anderson and Davis then show that the effect is concentrated during nights and weekends (as you would expect), and that the results extend to all fatal crashes, in addition to the alcohol-related fatal crashes that their main analysis is based on. Based on some back-of-the-envelope calculations, they show that:

Our estimates of the effects on alcohol-related fatalities imply that Uber saved 214 lives in 2019, or a reduction of approximately 6.1%.

As noted in my earlier post, driving and using Uber are substitutes. The presence of Uber should reduce the incidence of driving, including drink-driving. This appears to be borne out by the analysis of Anderson and Davis. However, their analysis falls a bit short of convincing causal evidence, because it lacks exogenous variation in the number of rideshare trips. And, it would be interesting to know what effect (if any) omitting New York (in particular) and Seattle had on the analysis. Take this as suggestive evidence for now of the positive effects of Uber (in contrast with the negative effects on traffic congestion).

[HT: Richard Holden at The Conversation]

*****

[*] The normalisation process leads to data that are expressed as a proportion of the number of rides originating in a specific Census tract in San Francisco. They have to omit Seattle and New York from their data, because the actual number of Uber rides in those cities is published, which would allow anyone to back out the number of rides in every other Census tract.

Monday, 20 December 2021

Three papers on the gender gap in economics seminars

The culture of the economics profession has attracted a lot of attention of late (see this post, or the list of posts at the end of this one for more on the gender gap in economics more generally). Through all of this attention, there have been some interesting and important changes underway (see this post, for example). A reasonable question, then, is whether things are improving.

Three recent papers may provide some early indications, in relation to economics seminars. The first is this NBER Working Paper by Pascaline Dupas (Stanford University) and co-authors. They systematically collected data from 460 economics seminars and job market talks across 32 top universities from January to May of 2019, as well as presentations at the NBER Summer Institute. So, these data don't tell us much about changes over time, but may provide a useful baseline against which to compare progress. Importantly, their data focuses on the questions and interruptions that speakers face, which is a key aspect of the climate of seminars and presentations (see this post, for example). Overall, Dupas et al. find from the seminar and job market talks data that:

On average, roughly 26 questions are asked during a regular economics seminar and 35 questions are asked during a job market talk. For a 90-minute seminar, this represents one interruption every 3.5 and 2.5 minutes, respectively - although interruptions are not uniformly distributed during the time allotted. Moreover, there is considerable heterogeneity with the number of questions ranging from a low of 5 to a high of 69 for any given seminar. There are 3.6 times as many questions from men as from women during regular seminars - and 7.6 times during job market talks - despite men only outnumbering women roughly 2 to 1 in attendance (and 3 to 1 in job talks)...

In terms of the type of questions, roughly 35 percent of all questions in regular seminars (37 percent in job market talks) are classified as clarifications, followed by another 17 percent (13 percent) that are classified as comments. Suggestions, follow-ups, and criticisms each account for 10 percent or less in both regular seminars and job market talks - perhaps countering the reputation that economics is as an overly critical discipline.
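The arithmetic behind those figures is easy to check. The sketch below is mine rather than anything from the paper: it assumes questions are spread evenly over a 90-minute seminar (which the authors note is not the case), and it treats the "roughly 2 to 1" and "3 to 1" attendance ratios as exact. The function names are hypothetical.

```python
SEMINAR_MINUTES = 90  # seminar length used in the quoted passage

def minutes_per_question(n_questions, minutes=SEMINAR_MINUTES):
    """Average gap between questions, assuming they are evenly spread."""
    return minutes / n_questions

def per_capita_ratio(question_ratio, attendance_ratio):
    """Questions asked per man, relative to questions asked per woman."""
    return question_ratio / attendance_ratio

# Question counts and ratios quoted from Dupas et al.
regular_gap = minutes_per_question(26)      # regular seminars
job_talk_gap = minutes_per_question(35)     # job market talks
regular_pc = per_capita_ratio(3.6, 2.0)     # men ask 3.6x the questions, 2:1 attendance
job_talk_pc = per_capita_ratio(7.6, 3.0)    # men ask 7.6x the questions, 3:1 attendance

print(f"Regular seminar: one question every {regular_gap:.2f} minutes")
print(f"Job market talk: one question every {job_talk_gap:.2f} minutes")
print(f"Per-capita ratio (men to women), regular seminars: {regular_pc:.2f}")
print(f"Per-capita ratio (men to women), job market talks: {job_talk_pc:.2f}")
```

The per-capita ratios are the more telling numbers: even after allowing for men outnumbering women in the audience, each man asks roughly twice as many questions as each woman, and more than that in job market talks.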

Comparing male and female presenters, they find that:

...women presenters are asked 3.8 additional questions (p<0.01) relative to men (a 12 percent increase). Accounting for the influence of a range of other factors about the audience, the presenter, the topic, and the coders, reduces the differential to 2.4 questions (p<0.05). This disparity appears most pronounced during recruitment (“job market”) talks (3.7 extra questions, p<0.05) and regular seminars with an external (rather than in-house) speaker (2.6 extra questions, p<0.05)...

Although we find that women receive a greater number of suggestions and clarifying questions, we also find that they are more likely to be asked questions that are rated as patronizing or hostile.

The latter finding should be of particular concern. Also:

Aggregating across negative tones (questions which are patronizing, disruptive, demeaning or hostile), women receive 0.5 more such questions, three quarters of which seem to be coming from men.

Dupas et al. also analyse a much more limited dataset from the NBER Summer Institute, but find broadly similar results. However, this dataset allows them to investigate whether the format of the seminars makes a difference. On that point, they find that:

...having a discussant and/or Q&A at the end does not mitigate the differential treatment of women presenters. Indeed, women receive more questions than men even in those presentations that had formal discussants... The only mitigating factor appears to be the “moratorium” on questions in the first 5 or 10 minutes of the talk: with the caveat that this represents a very small sample of presentations (N=45), we find that the moratorium completely undoes (if anything, reverses) the gender gap. And this appears to be the result of fewer “clarifying” questions that end up being deferred anyway or followed up on later when asked too early.

So, there is some suggestive evidence that the format of the seminar may make a difference. In my experience, people interrupting the speaker to ask clarifying questions are often not adding value to the talk for anyone but themselves. The takeaway from this paper, though, should be that economics still has a lot of work to do to improve the seminar climate.

The second paper is this article by Jennifer Doleac (Texas A&M University), Erin Hengel (University of Liverpool), and Elizabeth Pancotti (Employ America), published in the papers and proceedings issue of the American Economic Review (ungated version here). Doleac et al. report on the characteristics of invited seminar speakers in a panel of 66 economics departments (60 from the US, and 6 from outside the US) from 2014 to 2019. So, here we get a longitudinal dimension, although Doleac et al. don't have data on the dynamics within each seminar. They focus on the trends over time, distinguishing between genders and between under-represented minority (URM) speakers and non-URM speakers. Here are the key trends, from their Figure 1:

If you squint, you may pick up a slight upward trend in the proportion of speakers who are non-URM women, and a slight downward trend for non-URM men (the medians are less noisy, so may better represent the trends over time). However, there has not been much change for URM speakers (men or women). Indeed:

Forty-three of the 66 departments in our sample did not invite a single URM woman to speak during this period; 39 did not invite a single URM man to speak.

Maybe there's some small sign of improvement for non-URM women in terms of their proportion of invited speaking opportunities. However, Doleac et al.'s data only goes up to 2019, and it would be interesting to see how more recent trends have played out, especially given the pandemic. Fortunately, that's pretty much what the third paper does, which is this discussion paper by Marcus Biermann (UC Louvain). Biermann looks at how the coronavirus pandemic has affected economics seminars, using data from the seminar series of 270 institutions worldwide (including 243 universities, 14 central banks, 11 research institutes, and 2 international organisations). The dataset includes over 12,000 seminars over the period from 2018 to 2020. Overall:

At the seminar level, 21.8 percent of the seminars are held by female speakers. The average speaker has about 12.2 years of experience after PhD award. The top 1 percent of researchers in terms of their overall output and in terms of their publication record in the last 10 years account for 7 and 12.5 percent of seminars, respectively. The 200 top young economists held 3.3 percent of the seminars...

Looking at the impact of the pandemic, Biermann finds:

...a 7.47 percentage point increase in the relative likelihood that the seminar speaker after the technology shock is female, which is about 34.3 percent in terms of the pre-technology shock mean.
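As a quick consistency check on that quote, we can convert the percentage point increase into a relative effect. This is only a sketch: the pre-shock mean is not reported separately in this post, so I am assuming it is close to the whole-sample share of 21.8 percent quoted earlier, and the function name is my own.

```python
def relative_effect(pp_increase, baseline_share):
    """Relative change (in percent) implied by a percentage point increase
    on a baseline share, with both inputs expressed in percent."""
    return 100 * pp_increase / baseline_share

# 7.47 is quoted from Biermann; 21.8 is the whole-sample female share,
# used here as an assumed stand-in for the pre-shock mean.
effect = relative_effect(7.47, 21.8)
print(f"Relative increase: {effect:.1f} percent")
```

Under that assumption, the result lines up with the "about 34.3 percent" that Biermann reports.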

That is quite a substantial effect. Also, he finds that the increase in likelihood of a speaker being female is larger for distances between the home and host institutions of between 1475 and 5000 kilometres. Biermann suggests that:

This implies that parts of the increase in the share of female speakers are driven by a supply side response for medium length distances. The requirement to travel to medium length distant places and to stay overnight may have hindered women to accept seminar invitations before the technology shock.

Biermann's paper also shows some other interesting trends separate from the change in gender composition of seminar speakers, including:

...that the overall number of seminars declined and that the decline was not driven by the short-run supply of speakers... The distribution of seminars speakers shifted toward researchers of better quality... The geography of knowledge dissemination changed significantly as the average distance between host and speakers’ institutions increased by 32 percent and the share of seminars across borders also increased. Finally... the inequality in presentation opportunities manifested itself in inequality in citations.

Overall, these three papers provide a lot of food for thought on the state of and trends in economics seminars and the gender gap. Clearly, the seminar climate needs to improve from where it was in 2019. The trends suggest only a small change in the gender distribution of seminar speakers over time (and it is possible that the difficult climate for female seminar speakers may contribute to that), but the recent shift to online seminar series may be a factor in reducing the gender gap in seminar speakers. However, these three papers also leave a lot of questions unanswered, including whether the change in gender composition of speakers as a result of the pandemic has been sustained as most seminar series moved to online or hybrid formats, whether the change in gender composition was matched by a change in composition in favour of URM speakers, and importantly, if the seminar climate is a problem, how it can best be improved. Hopefully, the increasing research attention in this area will provide us with some answers to these questions soon.

[HT: Marginal Revolution for the Dupas et al. paper, and David McKenzie at the Development Impact Blog for the Biermann paper]

Saturday, 18 December 2021

Women are more competitive when they can be more pro-social

There is an array of research that demonstrates that men are more competitive than women (and boys are more competitive than girls; e.g. see here). This effect has been amply demonstrated in laboratory experiments (e.g. see here). In a recent article published in the journal PLoS ONE (open access, with a non-technical summary on The Conversation), Alessandra Cassar (University of San Francisco) and Mary Rigdon (University of Arizona) investigated whether the gender gap in competitiveness in laboratory contexts arises from the payoff structure of those experiments. Cassar and Rigdon argue that:

...women may be just as competitive as men, if the incentives involved reflect the social environment.

In their experiment, which was undertaken with 238 undergraduate students at Chapman University; University of California, Santa Cruz; and Simon Fraser University, Cassar and Rigdon:

...focus on one such scenario in which the prize for winning a tournament includes a social dimension: Winners have the option to share some of the prize with one of the losers. This prosocial option, known to the participants ahead of the competition, may appeal to women who are motivated to gain control of the distribution of resources or to repair social connections post-competition.

Research participants were:

...randomly assigned to one of two treatments: Baseline or Dictator. Each treatment consists of three rounds of a real effort task, the matrix search, under varying payment schemes: a piece rate per correct answer (round 1), a mandatory tournament where all subjects experience the competitive environment (round 2), and a choice between being paid according the piece-rate scheme or the tournament scheme (round 3).

Cassar and Rigdon then look at differences in the choices made in Round 3 of the experiment, between those who, if they won, were able to share some of the proceeds of the task with others (the 'Dictator' treatment, named after the 'Dictator Game'), and those who didn't get the option to share (the 'Baseline' treatment). The outcome is well-summarised by Figure 1 in the paper (which shows the proportion of male and female participants who chose the tournament rather than the piece rate in Round 3 of the experiment):

Women are clearly more competitive (more willing to choose the tournament rather than the piece rate) in the Dictator treatment than in the Baseline treatment. This difference is highly statistically significant, and in further analysis:

...controlling for these differences in individual abilities (using performance in the mandatory round 2 tournament), risk preferences, and beliefs explains some of the gender gap in competitiveness, but leaves the interaction effect of gender and treatment reported in model 1 largely unchanged and equally significant...

So, changing the context of the reward scheme can change the incentives such that women are just as competitive as men. The question now is, how can this be translated to real-world contexts (such as tournaments in the labour market or in education)?

[HT: The Conversation]

Read more:

Friday, 10 December 2021

Gender differences in answering multiple choice in a high-stakes test

Back in October, I posted about gender differences in multiple choice answering, which was based on research that used data from the PISA survey of high school children. The results demonstrated that male students perform better than female students, which is a common feature of multiple-choice tests (see the links at the bottom of this post for more). The research also provided some suggestive evidence that confidence in answering was important, along with stereotype threat, as I noted in this 2019 post. Related to confidence, part of the difference in performance between male and female students depends on differences in the propensity to leave some questions blank. Students that are less confident (who are relatively more likely to be female students than male students) are more likely to leave an answer blank.

Now, PISA is a 'low-stakes' test, in the sense that the results don't matter for the students at all. So, there is no negative consequence to leaving a question unanswered. That isn't the case in a high stakes examination, where guessing may come with a positive net payoff (if there is no penalty for a wrong answer), or may offer no net advantage (if there is a penalty). In that case, students may differ in whether they leave questions blank, depending on their confidence and their degree of risk aversion.
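To make the payoff from guessing concrete, here is a minimal sketch of the expected score from a blind guess. The number of answer options and the size of the penalty are illustrative assumptions (they are not taken from the paper), and the function name is hypothetical.

```python
from fractions import Fraction


def expected_guess_score(n_options, penalty):
    """Expected score from guessing uniformly at random on one question.

    A correct answer earns 1 point; a wrong answer loses `penalty` points.
    Both parameters are illustrative assumptions, not figures from the paper.
    """
    p_correct = Fraction(1, n_options)
    return p_correct * 1 - (1 - p_correct) * Fraction(penalty)


# No penalty: a blind guess among five options has a positive expected payoff.
no_penalty = expected_guess_score(5, 0)

# A quarter-point penalty with five options: guessing offers no net advantage.
with_penalty = expected_guess_score(5, Fraction(1, 4))

# Ruling out one option first tips the balance back towards guessing.
after_elimination = expected_guess_score(4, Fraction(1, 4))
```

Under these assumptions, a risk-neutral student is indifferent between guessing blindly and leaving the question blank when the penalty is set this way, so any systematic difference in blank answers has to come from something else, such as confidence or risk aversion.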

The propensity of male and female students to leave questions blank is investigated in this recent article by Perihan Saygin and Ann Atwater (both University of Florida), published in the journal Economics of Education Review (sorry, I don't see an ungated version online). They use data from the Turkish OSS, which is the main college admissions examination. This examination has an interesting feature that it has several sections, and those sections have different weights for different students. As Saygin and Atwater explain:

The Turkish high school curriculum features high school students being split into tracks of their choosing in their second year. The tracks are Science-Mathematics (Quantitative), Literature-Mathematics (Equally Weighted), Social Science-Literature (Qualitative), Foreign Languages, and the Arts. These fields line up with the topics covered on the ÖSS. It has a total of four core sections: social science (which covers history, geography, and philosophy), science (biology, chemistry, and physics), mathematics, and literature... For each subject, the exam has a lesser difficulty section and a higher difficulty one. Thus, there are 8 sections in total on the ÖSS...

The section weights then depend on which track the students were in during high school. So, a student in the quantitative track will have a higher weighting on the maths sections, but a lower weight on the social science section, whereas a student in the social science-literature track has the opposite. This allows Saygin and Atwater to investigate how differences in the stakes associated with particular sections affect the difference in propensity to leave questions blank between male and female students. Based on a sample of 1792 randomly selected OSS participants taking the OSS examination for the first time, they find that:

...not only is the gender difference in tendency to leave questions blank largest on sections that cover mathematics, but that this difference is only significant for the test takers on the track that places the most emphasis on these sections... We also provide evidence that this gap is larger on more difficult sections of a given subject despite these sections being weighted equally to the lower difficulty sections in score calculations.

Saygin and Atwater argue in a number of places in their paper that their results demonstrate that risk aversion isn't playing a role. However, when you see the results summarised as they are above, it is hard to draw any other conclusion. If a student is worried about the risk associated with guessing, then that risk is highest on the sections of the examination where the stakes are highest. Unfortunately, Saygin and Atwater don't have any measure of risk aversion. They do have measures of confidence, and looking at those they find that:

...a positive self-assessment on a given subject is related to skipping behavior on that subject and explains part of the gender differences in tendency to leave questions blank. In addition to this, we provide evidence for gender differences in self-assessment in a given subject conditional on the performance on the corresponding test section. Male test-takers are more likely to report a positive self-assessment than their test performance would suggest in math, science, and social science while this gender difference is inverted in literature. This variation in reported self-assessment across subjects matches the pattern of the observed gender differences in tendency to leave questions blank across subjects.

So, overconfidence explains at least some of the difference in leaving questions blank. The inability to convincingly eliminate risk aversion, or the interaction between risk aversion and confidence, leaves the question of the mechanisms underlying this behaviour still open.

Also, the sample that Saygin and Atwater use is somewhat idiosyncratic. The 1792 final sample is drawn from a larger random sample of nearly 10,000 OSS students. As far as I can tell, the smaller sample arises when they eliminate students that are attempting the OSS examination for the second or subsequent time. That suggests that around 80 percent of students in the OSS make multiple attempts before they get an examination ranking that they are satisfied will gain them entry into a good Turkish university. Clearly, there is a risk of selection bias in the sample in this research that is not accounted for.

Anyway, this paper modestly contributes to the idea that confidence contributes to the gender difference in multiple choice examination performance. However, we still need more research to better understand this topic.

Read more:

Thursday, 9 December 2021

Is this increasing gender equity, or gender inequity?

A post at the Dangerous Economist pointed me to this 2020 Medium article by Koen Smets (which is worth reading in its entirety):

Motor insurance in Europe forms a very interesting case study. Traditionally, insurers charged women less, because they tend to be safer drivers, and hence make fewer and smaller claims. Unlike life expectancy, the factors determining the risk here are much more linked to individual choice and behaviour. Since 2012, an EU directive forbids insurers to use gender as an element in the calculation of the premium. So, in Q4 2011, men paid on average 17% more than the overall average premium, while women paid 20% less. In 2018, that difference with the overall average premium had shrunk to a 5% uplift for male drivers, and a 6% discount for female drivers. (The residual difference stems from the fact that men tend to drive more miles per year, in more powerful cars.) So, relatively speaking, the ‘gender equity’ intervention has increased the average premium for the lower-risk women by more than 17%, while it has cut it by just under 10% for the higher-risk men. Is this an improvement? It’s not so sure. [sic]

Insurers charge higher motor vehicle insurance premiums to male drivers, because male drivers cost the insurers more. Male drivers tend to drive more kilometres, and have more severe accidents (see also here). It is reasonable for insurers to charge a higher premium to a more costly segment of the population, especially where those drivers are more costly because of their own behaviour (men could be lower cost to insurers, if they drove differently).

Insurance is subject to an asymmetric information problem that economists refer to as adverse selection. The uninformed party (the insurer) cannot easily tell drivers with 'good' attributes (low-risk drivers) apart from drivers with 'bad' attributes (high-risk drivers). To minimise the risk to themselves of engaging in an unfavourable market transaction, it makes sense for the insurer to assume that every driver is high risk. This leads to a pooling equilibrium - low-risk drivers are grouped together with the high-risk drivers and all drivers pay the same premium, because they can't easily differentiate themselves.

Since the premium is based on drivers of all risks on average, many low-risk drivers will find the cost of insurance to be too high. They will drop out of the market. The average risk of the remaining pool of insured drivers will increase, so the insurer will need to increase the insurance premium. Medium-risk drivers may then find the premiums too high, and drop out of the market. So, insurers raise premiums again. And so on, until the market fails because only the riskiest drivers would be left, and the insurer surely doesn't want to insure them!
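The unravelling described above can be sketched as a simple loop. This is a toy illustration only: the expected claim costs, the assumption that each driver will pay up to 20 percent more than their own expected cost, and the function name are all hypothetical.

```python
def unravel(expected_costs, markup=1.2):
    """Toy adverse selection spiral under a single pooled premium.

    Each driver has an expected annual claim cost, and is assumed to be
    willing to pay up to `markup` times that cost for cover. The insurer
    cannot tell drivers apart, so it charges everyone the average cost of
    whoever is still in the pool.
    """
    pool = sorted(expected_costs)
    history = []  # (premium, number of drivers insured) in each round
    while pool:
        premium = sum(pool) / len(pool)
        history.append((premium, len(pool)))
        stayers = [cost for cost in pool if markup * cost >= premium]
        if stayers == pool:
            break  # nobody else wants to leave, so the pool is stable
        pool = stayers
    return pool, history


# Ten hypothetical drivers with expected costs of 100, 200, ..., 1000.
final_pool, history = unravel(range(100, 1100, 100))
for premium, insured in history:
    print(f"premium {premium:.0f}, drivers insured {insured}")
```

In this example the premium climbs each round as the lower-risk drivers leave, and the pool settles with only the three riskiest drivers still insured.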

One way of avoiding this adverse selection problem is for insurers to try to reveal how risky a driver each insurance applicant is. That way, we would have a separating equilibrium, where high-risk drivers pay higher premiums, and low-risk drivers pay lower premiums. When the uninformed party tries to reveal private information (like how risky a driver an insurance applicant is), we refer to this as screening. In this case, the insurer uses various characteristics of the insurance applicant to estimate how risky they are likely to be. These characteristics might include their insurance history, past driving behaviour, the type of car they are insuring, and their demographic characteristics (including age and gender).

By imposing a law that equalises insurance premiums for men and women, the government is essentially telling insurers that they can no longer use gender as a screening tool for determining which drivers are higher risk. In the absence of that information, the insurer is a bit less informed than before. It moves things back towards the pooling equilibrium (but not all the way, because insurers still know the other details of the applicants). At the margin, insurers will now tend to over-estimate the risk of female drivers, and under-estimate the risk of male drivers. The consequence of this is that the premium for riskier male drivers becomes lower than it would have been without the law, and the premium for female drivers becomes higher than it would have been without the law. Essentially, relatively safe female drivers are cross-subsidising relatively riskier male drivers.
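The cross-subsidy can be illustrated with some simple arithmetic. The group sizes and expected costs below are made-up numbers, and the sketch assumes the extreme case where gender was the only information separating the two groups (in reality, insurers still observe other characteristics, so the shift would be smaller).

```python
def pooled_premium(groups):
    """Break-even premium when the insurer must charge every group the same.

    `groups` maps a group name to (number of drivers, expected cost per driver).
    """
    total_drivers = sum(n for n, _ in groups.values())
    total_cost = sum(n * cost for n, cost in groups.values())
    return total_cost / total_drivers


# Hypothetical numbers: equal-sized groups with different expected costs.
groups = {"women": (500, 400.0), "men": (500, 600.0)}

pooled = pooled_premium(groups)

# Positive values mean the group pays more than its own expected cost.
transfers = {name: pooled - cost for name, (_, cost) in groups.items()}
print(pooled, transfers)
```

With these numbers, each female driver pays 100 more than her expected cost and each male driver pays 100 less, which is exactly the transfer from safer to riskier drivers described above.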

Wait! Wouldn't the intention of the EU directive have been to increase equity between female and male drivers? If you have a policy that equalises insurance premiums for male and female drivers, but in so doing makes male drivers better off and female drivers worse off, is that actually increasing gender equity, or decreasing gender equity? It would be interesting to know whether the policy makers had thought about this at all.

Read more:

Tuesday, 7 December 2021

The economics of the government's plan for 'social unemployment insurance'

One of the big (and surprising) announcements in the Budget earlier this year was that the government was developing a 'social unemployment insurance' scheme. This would presumably sit alongside the current unemployment benefit system, but would work in a similar way to accident compensation, paying each person who is made unemployed (and meeting certain conditions) 80 percent of their prior wage up to a certain cap.

This would represent a significant shift in the style of social security system that New Zealand operates. In my ECONS102 class, we distinguish three types (or models) of social security system:

  1. A social assistance model - where there is an emphasis on self-reliance and responsibility, and the government provides support (often means tested) where a person would otherwise face hardship;
  2. A social insurance model - where social assistance is available and based on previous contributions to a fund (which might be an individual account, or a general account for all insured people); and
  3. A social citizenship model - where all citizens have a right to assistance for any contingencies they face (and the assistance is often not means tested).

In reality, most social security systems have features in common with all three types, but New Zealand's system up until now has mostly been a social assistance model, with the exception of accident compensation, which is clearly a social insurance scheme. This proposed introduction of social unemployment insurance would move unemployment assistance into the social insurance model (it would be interesting to see what the government would do with sickness and invalids benefits, or whether they would remain under the old system, along with sole parents and student allowances).

Anyway, there was a great article in The Conversation today by Simon Chapple and Michael Fletcher (both Victoria University of Wellington) that outlines some of the economic issues with a social insurance scheme:

However, there are two problems with the private insurance market, meaning they under-provide relative to people’s real need.

The first problem is called “adverse selection”, meaning people choosing to buy insurance have better information about the risks facing them than insurance businesses do, and no good reason to disclose that information.

To protect themselves from this, insurance companies set premiums higher. In turn, due to the costs, this leads to people being under-insured. Ultimately, society’s best interests aren’t met.

There’s also the problem of “moral hazard” – if a person has insurance they may take on more risk, without the insurer knowing exactly which customers are adopting riskier behaviour.

Again, insurance companies set higher premiums and people are generally under-insured. And again, this isn’t in society’s best interests...

These market failures mean there is potential for well-designed government interventions to meet the social interest. In particular, making everyone join a social insurance scheme would fix the adverse selection problem.

But a compulsory social insurance system also expands the scope for moral hazard. People might change their behaviour to increase their eligibility for an insurance payout. They might take on jobs with higher redundancy risks, or be less motivated to look for work, because the consequences are now less severe.

The problems of information asymmetry (including adverse selection and moral hazard) are among my favourite topics to teach in my ECONS102 class. Chapple and Fletcher are right that the unemployment social insurance scheme would not have an adverse selection problem (provided it is compulsory, in the same way that accident compensation currently is), and the key problems would be moral hazard.

To expand on the moral hazard problems a little bit, workers would be less fearful of losing their jobs, because they would receive a higher unemployment payment than previously. So, at the margin, workers would not work as hard, and productivity might decrease. Similarly, absenteeism might increase, which also reduces productivity. 

On the other hand, wages might increase. To see why, consider a search model of the labour market. This model recognises that each matching of a worker and a job creates a surplus that is shared between the worker and the employer, based on their relative bargaining power. A higher unemployment payment increases the worker's bargaining power, since they can afford to hold out for a better deal. Employers will have to offer slightly higher wages than before, in order to attract workers to leave the unemployment payment and accept the job offer. So, wages will increase, and employers will find that vacancies take a little longer to fill.
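One common way to formalise the surplus-splitting idea is a simple bargaining rule, where the worker receives their fallback option plus a fixed share of the surplus from the match. The sketch below is an assumption-laden illustration, not a full search model: the employer's fallback is normalised to zero, the worker's share is fixed, and all of the numbers and names are hypothetical.

```python
def bargained_wage(match_output, unemployment_payment, worker_share=0.5):
    """Wage under a simple surplus-splitting rule.

    The worker's fallback is the unemployment payment. The surplus from the
    match is what the job produces over and above that fallback, and the
    worker receives `worker_share` of it on top of the fallback.
    """
    surplus = match_output - unemployment_payment
    return unemployment_payment + worker_share * surplus


# Hypothetical weekly figures: the job produces 1000 of value.
wage_before = bargained_wage(1000, 400)  # lower unemployment payment
wage_after = bargained_wage(1000, 600)   # higher insurance-style payment
print(wage_before, wage_after)
```

In this sketch, raising the unemployment payment improves the worker's fallback position rather than their share of the surplus, but the result is the same: the bargained wage rises.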

Workers may also benefit from better job matches. Since they can afford to stay on the higher unemployment insurance payment for longer, they can afford to wait and find a job they really want, rather than accept the first half-decent offer they receive. The number of unemployed will likely increase, and the average length of unemployment spells will also increase.

Clearly, there is a lot for the government to balance here. Chapple and Fletcher also note that:

If it turns out there are gaps in the current system, advocates of social insurance must also consider:

  • such a scheme may simply be substituting for one or several of the existing solutions, which would then reduce if the scheme were introduced

  • reforming and improving what already exists may be preferable in terms of cost, effectiveness and equity than introducing an entirely new system

  • there may be implications for both equity and erosion of the core welfare system of creating a separate, higher tier of assistance for some.

At this stage, all we have had from the government is an announcement, and a promise of "public consultation later in 2021". Presumably that consultation has been delayed until next year, due to the pandemic. It will be interesting to see what comes out of this.

Monday, 6 December 2021

The supply of black market vaccine passes

I was interested to read this article from The Spinoff earlier this week:

New Zealand’s traffic light system comes into play today, and perhaps inevitably, it’s being accompanied by a new black market for stolen, shared and faked vaccine passes...

A Telegram seller who, when I last spoke to them, was selling fake vaccine record cards recently made a big pivot to buying and selling official My Vaccine Passes.

The seller, “Vax Card NZ”, told me via Telegram private message on Wednesday that they were diversifying: “Just transitioning to cover the digital passes, but we still are selling the cards.”

They went on to explain that they’re trying to build up a stock of official passes with a variety of names and birth dates. “We ideally need a variety of cards to cover the base demographics,” they said, in order to be able to offer suitable options to buyers. But so far they’ve not had much luck getting official cards, and have been raising the price they’re offering to buy the passes. “We started at $50 and are now offering $125, and will continue to raise prices until we are able to purchase enough stock,” they continued.

Clear evidence that the supply curve for vaccine passes starts from a point above the x-axis (nobody is willing to sell their vaccine pass even at a price of $50), and is expected to be upward sloping (Vax Card NZ "will continue to raise prices until [they] are able to purchase enough stock"). I wonder how high the price will need to go before they have enough variety of passes to re-sell?

This bit is worrying though:

As of Wednesday, Vax Card NZ reported that they hadn’t been able to buy any cards, but they were expecting that to change. “This will likely happen when the passes start to be used as people will be able to photograph other people’s passes and then sell them,” they explained, pointing out that all they needed was an image of the official QR code in order to recreate the pass for sale.

This functionally creates a market for stolen vaccine passes, incentivising people to capture images of strangers’ vaccine passes; a process Vax Card NZ has called “mining” in their online advertisements. 

I guess that, just like your credit card, you want to be careful who is scanning your My Vaccine Pass, and what they are doing with it. To be safe, perhaps each of us should be looking at the screen of the scanner, to make sure that the person doing the scanning is using the official app, and not simply taking a photo of our QR code to resell?

The only way to effectively thwart this behaviour would be for every business that is required to scan vaccine passes, to be routinely checking every pass against a photo ID. That way, it would be more difficult to pass off a fake vaccine pass as genuine. Unfortunately, there doesn't appear to be much of an incentive for businesses to have a robust process in place.

Requiring photo ID then creates problems for the small minority of people who don't have photo ID. To solve that problem, perhaps the government could subsidise people to get Kiwi Access cards? They currently cost $55 each, but they don't require a test (like a driver's licence) or citizenship (like a passport). Perhaps when a person registers with My Vaccine Pass, they could get sent a one-time voucher for a Kiwi Access card.

None of this is rocket science. We could have a vaccine pass system that works for everyone, eliminates the bulk of the black market (although those who are seriously enthusiastic about avoiding vaccination will still find a way, like getting a fake driver's licence to go with their fake vaccine pass), and doesn't meaningfully exclude sections of the population.

Saturday, 4 December 2021

Book review: Grave New World

I just finished reading Grave New World, by Stephen King (the senior economic advisor at HSBC, not the horror author). Although, some readers of this book might think it mildly horrifying in a pre-apocalyptic sense. The subtitle is "The end of globalization, and the return of history", which pitches it as antithetical to Francis Fukuyama's famous essay and book The End of History, which is probably very appropriate.

King's narrative is essentially that globalization can, and will, go into reverse. The prophesied mechanisms for this reverse are increasing inequality within countries, increasing migration flows, a loss of credibility in international institutions, and a reduction in US global hegemony as other superpowers (particularly China, but also Russia) rise. To be honest, I really struggled with this book. It is very well written and easy to read, but King's approach is mostly to gather together a lot of contemporary trends, weave a story that seems to link them all together, and propose where all this is leading. I found it overall to be mostly speculative and not very compelling.

However, as I said, it is well written and despite my failure to buy into the overall narrative, there are definitely notable highlights. I really appreciated King's use of political philosophy. In particular, he points to Montesquieu's The Spirit of the Laws, where King notes that Montesquieu argued:

...that a democratic nation state would only survive if the citizens living within its borders thought their own interests were in accord with the interests of the state as a whole... Alternatively, should citizens no longer be willing to place their faith in elected lawmakers and politicians... a democracy would eventually collapse on account of an excessive 'spirit of inequality'...

If globalization is to succeed in a world of nation states, it either needs to retain the support of nation states, or the nation states themselves need to change. Yet if each nation state experiences an increase in Montesquieu's 'spirit of inequality' - thanks to unintended or unexpected effects stemming from globalization - a point may be reached where domestic support for closer integration inevitably falters.

Among other things, King predicts the fall of the Euro currency, NATO, and the European Union within the near future (the epilogue of the book is written as if in 2044, by which time all of those falls have come to pass). Predicting the future is a sucker's game, but King is clearly up to the challenge, and is not shy. The book also has a few blind spots (notwithstanding that it was written in 2017), including the rise of Bitcoin and blockchain (which could have been foreseen four years ago), and then there's this:

Unlike previous superpowers, the US was not so interested in controlling the rest of the world. Instead it played its role as the first among equals...

I guess that could be true, if we first ignore the Korean War, the Vietnam War, American interventions in Central America, Iran and elsewhere, the Gulf Wars, Afghanistan, American dominance of the World Bank, the IMF, and the World Trade Organization, American cultural imperialism, and so on. Your mileage may vary.

Overall, this was an interesting book to read, but I would hesitate to recommend it to anyone who isn't looking to collect a variety of views on the future of globalization.

Friday, 3 December 2021

National football team performance and fertility

Like the media belief in a lockdown baby boom (see here), there is a belief in the media that sports team performances affect fertility and birth rates (e.g. see here, or here). Most stories like 'Super Bowl babies' have been proven to be a myth. However, throwing more data at a question like this is often good. That's what Luca Fumarco (Masaryk University) and Francesco Principe (University of Padova) did in this new article published in the journal Economics Letters (ungated earlier version here). Specifically, Fumarco and Principe looked at how national football (soccer) team performances at international competitions (FIFA World Cup and UEFA European Football Championship) affected the number of births nine months later, for 50 European countries.

National team performance was measured using the weighting of each match used in FIFA's Elo rating system (more on that later). Births were monthly counts. Fumarco and Principe find that:

Across all of the specifications, we see that, on average, an increase in performance by one standard deviation is associated with a reduction in monthly births by 0.3% nine months after the event.

They also perform a robustness check looking at the effect on other numbers of months after the event, and find that:

The effect of performance on monthly births is statistically significant nine months after the tournament... while the effect after ten and eleven months is not significant....

And the results were also statistically insignificant for 1-8 months after the event. So, on the surface, this seems to support the idea of a 'baby slump' rather than a baby bump from better national team performance. Fumarco and Principe conclude that:

...an increase in national team performance in international football competitions is associated with a drop in births nine months after the event...

We hypothesize that these results might be explained by individuals’ time allocations choices... the attendance of live events (e.g., from late afternoon to late night, on TV, at the stadium, on big screens in public places...) may reduce the time spent on physical intimacy...

The mechanism they propose is speculative. More importantly, there is good reason to doubt the headline results in this study. First, I'm not convinced that their measure of national team performance is valid. They claim to use national teams' performance "as measured by the ELO rating system", but clearly they do not. The Elo rating system that FIFA uses takes into account the strength of the opposition and goal difference (see here), neither of which makes an appearance in Fumarco and Principe's measure. [*] Fumarco and Principe take into account only the weighting of the match, which increases as the tournament progresses. That is a fairly crude measure of team performance, and not a whole lot better than the number of matches played, or the number of matches won. It would be interesting to see how the results panned out simply using the number of games.
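To show what is missing from a measure based only on match weighting, here is a minimal sketch of a generic Elo-type update. This is the textbook Elo formula, not necessarily the exact version that FIFA or Fumarco and Principe use; the default K value and the `margin_multiplier` parameter (a stand-in for a goal difference adjustment) are assumptions for illustration.

```python
def expected_score(rating, opponent_rating):
    """Expected result (between 0 and 1) under the standard Elo formula."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def updated_rating(rating, opponent_rating, result, k=20.0, margin_multiplier=1.0):
    """Generic Elo-type update after one match.

    `result` is 1 for a win, 0.5 for a draw, and 0 for a loss. `k` stands in
    for the match weighting, and `margin_multiplier` is a hypothetical
    adjustment for goal difference.
    """
    surprise = result - expected_score(rating, opponent_rating)
    return rating + k * margin_multiplier * surprise


# A win over an equally rated team is worth half of K.
even_match = updated_rating(1500, 1500, 1.0)

# Beating a stronger team earns more than beating a weaker one.
gain_as_underdog = updated_rating(1400, 1600, 1.0) - 1400
gain_as_favourite = updated_rating(1600, 1400, 1.0) - 1600
print(even_match, gain_as_underdog, gain_as_favourite)
```

The point is that the match weighting (K) only scales the update. The information about performance comes from the result relative to the strength of the opposition, which a weighting-only measure leaves out.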

Second, on a related note, Fumarco and Principe appear to use the full time series of monthly births for each country in their analysis (~17,000 observations). However, the tournaments only happen every two years, and most teams don't play in every tournament (or even any tournament). In those cases, Fumarco and Principe set the team performance variable equal to zero, which is not so different from a team that lost all of its games (which would be assigned 3 points, as they assign a minimum of 1 point per game). Including a bunch of months where there is no tournament and every country has a zero for team performance will seriously skew the results. Now, Fumarco and Principe use a variety of fixed effects, including month fixed effects, and month x year fixed effects. That will reduce some of this problem, but won't eliminate it entirely. It would be interesting to instead see how robust the results were to including only the month that is nine months after each tournament (i.e. April of each year), and applying a difference-in-differences format using countries that did not participate in each tournament as controls.

Third, there is no control for population in their model. It should be obvious enough that the number of births depends on the number of women of childbearing age. So, by excluding population size from the model there is a serious omitted variable bias. They do include country fixed effects, but that will simply reduce the size of this bias, not eliminate it.

This is a study that started with an interesting research question, but I don't think we can really take their results as given (even notwithstanding that they are correlations rather than causal). This is the sort of research that a good student could easily follow up on and improve upon.

****

[*] A side note: For a number of years, I generated Elo-type ratings for a number of international sports, along with Super Rugby and the NFL (see here). So, I have a bit of experience with these systems.