Tuesday, 30 November 2021

The effect of migrant children on native-born children's academic performance

Back in July, I wrote a post about how exposure to foreign-born students affected the academic performance of students in Florida. Unsurprisingly, there is a lot of related research on the impact of immigrant students on native-born students. The effect will depend on the characteristics of the immigrant students (e.g. whether they speak the language of instruction, or the education level of their parents), on how teachers respond (e.g. do they change the way that they present material?), and on how schools respond (e.g. are immigrant students clustered into particular classes?).

Identifying the effect of school peers (including immigrant students) on academic performance is quite difficult, mainly because there is a problem of selection bias. Parents often get to choose what school to send their children to. Schools usually get to choose how to allocate students across classrooms, sometimes on the basis of academic merit (often referred to as 'streaming'). These selection effects mean that the observed relationship between academic performance and the number (or proportion) of immigrant peers is going to be biased. Researchers must find some way to deal with the selection bias.

In this 2020 article (open access) by Kelvin Seah (National University of Singapore), published in the journal Australian Economic Review, selection bias is reduced by comparing student performance between two consecutive cohorts of students at the same school. That deals with any selection bias in relation to schools (because the comparison is within schools). Using data from entire school cohorts might deal with class selection issues (although I am not convinced - it simply means that the bias might be positive for some students, and negative for others, but there is no guarantee that it averages out to zero).
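To make the identification idea concrete, here is a minimal sketch (with entirely made-up data and variable names, not Seah's actual specification): with school fixed effects, the coefficient on the immigrant share is identified only from the difference between consecutive cohorts within the same school.

```python
# A minimal sketch of a within-school cohort comparison. All data and variable
# names are hypothetical; this only illustrates the general approach of
# absorbing school-level selection with school fixed effects.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(42)
rows = []
for school in range(50):
    school_quality = rng.normal(0, 5)            # unobserved school effect
    for cohort in (7, 8):                        # two consecutive grade cohorts
        share = rng.uniform(0, 0.4)              # immigrant share in the cohort
        for _ in range(30):                      # 30 native students per cohort
            score = 500 + school_quality + 20 * share + rng.normal(0, 30)
            rows.append({"test_score": score, "immigrant_share": share,
                         "school_id": school, "cohort": cohort})
df = pd.DataFrame(rows)

# With school fixed effects, the immigrant-share coefficient is identified only
# from differences between the two cohorts within each school.
within_school = smf.ols("test_score ~ immigrant_share + C(cohort) + C(school_id)",
                        data=df).fit()
print(within_school.params["immigrant_share"])   # close to the 'true' effect of 20
```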

Seah uses data from the 1995 TIMSS study, for Australia, Canada, and the United States. This study collected data on maths and science performance for students in seventh and eighth grades. The data set includes over 40,000 students from over 700 schools across the three countries. Measuring exposure to immigrant students as the proportion of non-native-born students in each school cohort, Seah finds that:

There are marked differences in the share of immigrant students to which native students are exposed in the three countries. On average, natives in the Australian sample have the highest share of immigrant peers in their grade level in school while natives in the Canadian sample have the lowest.

Looking at the effects on maths achievement in the TIMSS test, Seah finds that:

For Australia... the maths achievement of native students is positively associated with the share of immigrant peers in the grade... a 10‐percentage point rise in the grade share of immigrants increases native maths achievement by about 0.093 standard deviations (significant at the 5 per cent level)...

For Canada... the share of immigrant students in the grade is negatively associated with natives’ maths achievement (this relationship is statistically significant at the 5 per cent level)... A 10‐percentage point increase in the share of immigrant grade peers is estimated to reduce native maths achievement by 0.048 standard deviations...

For the United States... Once non-random sorting of immigrant and native students across schools is taken into account, the relationship between these variables disappears and the coefficient falls to essentially zero. The result is unaltered when controls for individual, family, and school‐grade characteristics are added...

The results for Australia might seem a little surprising at first, but Seah shows that immigrant children in Australia perform better in maths than native-born children. The results are similar (but the effects are much smaller) for performance in science. Digging a bit deeper into the maths results, Seah finds that:

...the peer effects of immigrant students are more adverse when immigrant students are non‐native speakers of the test language and when they have less‐educated parents. Further, the estimated peer effects of immigrant students attenuate when subject achievement of immigrant students is controlled for, suggesting that peer effects are at least partially working through immigrant students’ achievement.

There's nothing too surprising there. Seah then tries to tie the disparate results for Canada and Australia to the degree of autonomy that teachers have in setting the curriculum, and finds that:

...the results for maths achievement indicate that the peer effects of immigrants are more positive in schools where teachers have a high degree of influence in determining curriculum.

That suggests that, if we want to limit any negative effect of immigrant students on native-born students' achievement, we should allow teachers to better tailor their course offerings for their class. Unfortunately, that is almost the opposite of what we might conclude from this 2018 article by Hu Feng (University of Science and Technology Beijing), published in the Journal of Comparative Economics (sorry, I don't see an ungated version online). Feng uses data from the China Education Panel Survey (CEPS), and instead of international migrants, the focus here is on internal migrants. So, at the least, there is unlikely to be much of a language effect in Feng's sample. Interestingly, Chinese middle school students (who are the focus of Feng's study) are mostly allocated to classes either randomly, or using a "balanced assignment rule" (which means that the best and worst students go in one class, the second-best and second-worst in another class, and so on). This random allocation allows Feng to deal with the class selection issue (but I'm not entirely convinced about the absence of school selection, because wealthy parents could opt their children out of public schools).
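As an aside, the 'balanced assignment rule' is easy to illustrate. Here is a minimal sketch, based only on the description above (rank students on a prior score, pair the best with the worst, the second-best with the second-worst, and deal the pairs out across classes) - the rule that Chinese schools actually apply may differ in its details.

```python
# A minimal sketch of a 'balanced assignment rule', purely for illustration.
def balanced_assignment(scores, n_classes):
    """Return a list of classes (each a list of student indices)."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    classes = [[] for _ in range(n_classes)]
    pairs = []
    lo, hi = 0, len(ranked) - 1
    while lo < hi:
        pairs.append((ranked[lo], ranked[hi]))   # best with worst, and so on
        lo += 1
        hi -= 1
    if lo == hi:                                 # odd number of students
        pairs.append((ranked[lo],))
    for k, pair in enumerate(pairs):
        classes[k % n_classes].extend(pair)      # deal pairs out across classes
    return classes

scores = [88, 45, 73, 91, 60, 52, 79, 67]        # hypothetical prior scores
print(balanced_assignment(scores, n_classes=2))
```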

Feng looks at the effect of migrant peers on performance in maths, Chinese, and English, and finds that:

...the presence of migrant peers in the classroom has negative and statistically significant effects on math scores of local students... a ten-percentage-point increase in the proportion of migrant students in the classroom reduces local students' math test scores by 1.06 points, which is equivalent to 0.11 standard deviations...

...migrant peers have large and negative effects on local students’ Chinese test scores... a ten-percentage-point increase in the proportion of migrant students in the classroom reduces local students’ Chinese test scores by 1.06 points, which is equivalent to 0.11 standard deviations... Finally... migrant peers have negative but relatively small effects on local students’ English test scores.

Feng also finds that the results are larger for male than for female students. They then go on to look at how teachers respond to migrant students. They find that:

...in the classes with higher proportions of migrant students teachers are less likely to use the methods of group discussion and interaction with students, which are usually assumed to better improve students’ cognitive abilities... On the other hand... the presence of migrant students in the classroom has negative effects on the use of relatively advanced teaching media like multi-media projector, Internet, and pictures, models, or posters, which are important for teaching effectiveness.

So, it appears that when there are more migrant children in the classroom and teachers have the ability to change the mode of instruction, they do so in ways that make local students worse off. This is probably exacerbated by teachers' attitudes, because Feng reports that teachers "prefer to teach classes with fewer migrant students". Unfortunately, Feng's study is silent on whether teachers modify the curriculum in response to the presence of more migrant students.

It is worth noting that the negative effects in these two studies are contrary to the findings of Figlio et al., which I referred to in my earlier post. They found no effect of immigrant exposure on native-born students. This literature is crying out for a meta-analysis at some point. We also need some further studies to unpack the mechanisms, since it is important to better understand whether changing curriculum or teaching modes, or both, to suit immigrant students has a net negative or positive effect overall.


Monday, 29 November 2021

Teaching the minimum wage and economic justice

The conventional approach to teaching the minimum wage is to focus on the market-level impacts of the policy. This might involve the traditional model of supply and demand, or less commonly, a monopsony model or a search model of the labour market. What these approaches share is a focus on the labour market itself. By teaching with the market as the lens, we lose focus on the workers who actually receive the minimum wage.

I recently read this interesting 2013 article by Aaron Pacitti and Scott Trees (both Siena College), published in the journal Forum for Social Economics (sorry, I don't see an ungated version online). Pacitti and Trees outline an in-class exercise for teaching the minimum wage from the perspective of workers, and in particular considering the associated social and ethical issues. The exercise is very simple, and involves groups of students working together to prepare a budget for a single 22-year-old with no spouse or dependent children, who works full-time for the minimum wage. It is not an easy task: there are lots of trade-offs involved, and the money does not stretch very far (especially when you consider housing costs!). After the groups have completed their task, the lecturer can work through a class discussion of the various spending categories and how much each group allocated to them, which I think would further illustrate just how little the minimum wage can really buy. Pacitti and Trees note that:

Once students are challenged to prepare a detailed budget by spending category, they quickly realize that minimum wages do not generate enough income to support a sustainable standard of living, defined as the income necessary to maintain proper health, nutrition, and living arrangements without receiving any form of government, family, or charitable assistance. Results from using this exercise suggest that class discussion quickly moves beyond the blame-the-victim arguments that are commonly heard from students. The exercise shifts the analysis away from the market nexus and individual choice models, and toward the social, institutional, political, and ethical dimensions of minimum wages, giving students a broader and more holistic perspective on economics.

Pacitti and Trees also suggest a number of extensions, including more realistic scenarios (for people with debt repayments, those looking for a new job, or those with a family to support). They also suggest some further discussion points, including economic justice (that is, whether the outcome is ethical or fair), normative economic policy and distributional issues, and economic alternatives to the minimum wage.
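To see why the exercise bites, here is a minimal sketch of the arithmetic, using entirely hypothetical numbers (the wage rate, tax share, and spending amounts are illustrative assumptions, not figures from Pacitti and Trees):

```python
# A rough sketch of the budgeting exercise with hypothetical numbers.
hourly_wage = 7.25            # assumed minimum wage (US federal rate, for illustration)
hours_per_week = 40
weeks_per_month = 52 / 12

gross_monthly = hourly_wage * hours_per_week * weeks_per_month
net_monthly = gross_monthly * (1 - 0.15)      # assume ~15% tax and deductions

budget = {                    # hypothetical monthly allocations
    "rent": 700,
    "food": 300,
    "transport": 150,
    "utilities and phone": 120,
    "health": 100,
    "other": 80,
}

print(f"Net monthly income: ${net_monthly:,.2f}")
print(f"Planned spending:   ${sum(budget.values()):,.2f}")
print(f"Left over:          ${net_monthly - sum(budget.values()):,.2f}")
```

Even with fairly spartan allocations, the 'left over' line comes out negative, which is exactly the point of the exercise.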

All of this seems so obvious, but the problem may be trying to squeeze an exercise like this into an already crowded curriculum. However, it is likely that there are large gains to be made, not least in having students appreciate that market participants are not simply chess pieces to be moved around (which is one of the fallacies that Thomas Sowell cites in his book Economic Facts and Fallacies, which I reviewed here). And it can easily be extended into an interesting assignment for assessing students. I'll have to seriously consider whether this is something that I can fit into one of my papers.

Sunday, 28 November 2021

The productivity effects of coronavirus infection among professional footballers

It's easy to see that coronavirus infection has negative productivity effects. People who are infected are unable to work at first, and even after they return, their work performance is likely to be negatively affected for a while. However, working out how much productivity is negatively affected is difficult, because individual productivity is often difficult to measure, and infection with coronavirus may or may not be symptomatic. Measuring only the productivity effects on symptomatic cases would tend to over-estimate the actual effect of a coronavirus infection.

In a new working paper, Kai Fischer (Heinrich Heine University), James Reade (University of Reading), and Benedikt Schmal (Heinrich Heine University) look at the impacts on professional football (soccer) players in two top professional leagues in Europe (the Bundesliga in Germany, and Serie A in Italy). Their focus on footballers is important, because as they note:

...we are able to differentiate precisely between infected and non-infected individuals as the soccer leagues implemented rigid regimes of frequent and systematic testing for all players.

So, that gets around any problems of asymptomatic players resulting in over-estimated effect sizes. Productivity can also be precisely measured. Fischer et al. use the number of passes in a game, and justify this measure as follows:

Productivity can rather be considered as a function of various health aspects; mainly physical measures, for example acceleration, condition, and endurance, but also the cognitive capability to position oneself optimally on the pitch. The number of passes is related to all of these measures, which is why we base our analysis on this parameter. We consider COVID-19 as a shock to the underlying health aspects, that consequently causes a deterioration in performance.

Fischer et al. use a difference-in-differences approach, essentially comparing the performance of infected players before and after their coronavirus infection with the performance of uninfected control players. They use a variety of public sources to identify infected players, and over the period they consider (up to the middle of July 2021):

There have been 81 true-positive tests among players in Germany and 176 in Italy until mid of July 2021. We can clearly identify 76 players in Germany and 157 in Italy...

257 infections among 1,406 players imply that 18% of all players got infected until mid of July 2021. 

That appears to be a similar infection rate to that among the general population of the same age. Importantly, vaccination of players only started after the conclusion of the 2020/21 football season. Their dataset then includes 72,807 game observations on 1,406 players over the 2019/20 and 2020/21 seasons (including 40,607 observations of players who played at least some game time). Looking at the effects on productivity, Fischer et al. find on the extensive margin that:

...players have a 5.7 percentage points lower probability to play. However, effects appear to be mechanical, mainly driven by the first weeks after an infection, when quarantine breaks do not allow a player to participate in a match. The observed drop in playing frequency becomes insignificant quickly, but does not fully return to its former level. These results indicate that players marginally experience persistent effects on their likelihood to play...

So, infected players are less likely to play, both immediately during their infection period, and somewhat afterwards (although the latter is not statistically significant). Similarly:

Immediately after the infection and his return on the pitch, a player spends on average 6 minutes less on the field than before – this corresponds to a decrease by almost ten percent... The effect is visible right after an infection but quite long-lasting. Only after approximately 150 days or five months minutes played return to a level which does not significantly differ from pre-infection match times.

So, infected players play less time, even when they are playing. In terms of in-game productivity (measured by the number of passes), Fischer et al. find:

...a highly significant static difference-in-differences effect of -5.1 percent. Hence, we can precisely identify a deterioration in work performance following a cured COVID-19 infection. This effect is not transient but actually remains notably negative in course of time.

So, in addition to playing less, infected players are also less productive when they do play. Note that these results control for the number of minutes played, so the reduction in the number of passes isn't a result of playing less time. Digging a bit further, they find that:

...especially players of an age above 30 face the strongest performance drops of above ten percent. In comparison, younger players up to 25 years are only affected marginally.

That result would seem to make sense, if older players' fitness is affected more severely, or older players take longer to recover from their infection. Also, in relation to fitness:

...performance declines over the course of a match. While the effect seems to be stable at around -3% in the first thirty minutes, post-infected players face a deterioration of some additional three percentage points in later phases. Such a downward trend would be in line with COVID-19 affecting player’s endurance.

Fischer et al. also demonstrate that the effect of coronavirus is larger than for other respiratory illnesses (such as colds or flu), as well as other minor injuries. They also show some suggestive evidence that there are negative spillover effects on other members of the team, from an infected player's poorer performance.
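For readers who want to see what the underlying comparison looks like, here is a minimal difference-in-differences sketch on simulated data. The variable names, group sizes, and effect sizes are all assumptions, and Fischer et al.'s actual specification is much richer (player and matchday fixed effects, controls for minutes played, and so on); this just illustrates the before-versus-after, infected-versus-control comparison.

```python
# A minimal difference-in-differences sketch on simulated data (not the
# authors' actual estimator). Effect sizes and variable names are assumptions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for player in range(200):
    infected = player < 40                       # assume 20% of players infected
    ability = rng.normal(40, 8)                  # baseline passes per match
    for match in range(30):
        post = match >= 15                       # second half of the season
        passes = ability + rng.normal(0, 5)
        if infected and post:
            passes *= 0.95                       # assumed 5% post-infection drop
        rows.append({"infected": int(infected), "post": int(post),
                     "log_passes": np.log(max(passes, 1.0))})
df = pd.DataFrame(rows)

# The coefficient on infected:post is the difference-in-differences estimate
# (approximately a proportional effect, since the outcome is in logs).
did = smf.ols("log_passes ~ infected * post", data=df).fit()
print(did.params["infected:post"])               # close to log(0.95), about -0.05
```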

Overall, this suggests that there are significant and long-lasting negative productivity effects of coronavirus infection. That in turn suggests that the estimated negative economic impacts of the coronavirus pandemic may have been underestimated, if they fail to consider longer-term productivity impacts.

[HT: Marginal Revolution]

Saturday, 27 November 2021

How and when we develop our strategic reasoning

Some years ago, my son introduced me to the "Game of 21". The first player chooses a number (1 or 2), and then players take turns incrementing the count by 1 or 2. So, for example, if the first player chooses 1, then the second player could choose either 2 or 3, but if the first player chooses 2, then the second player could choose either 3 or 4. The winner is the player that chooses 21. My son beat me handily, but he knew the winning strategy, which is to always choose a multiple of 3, if one is available. To see why that's a winning strategy, we can work backwards from 21. If you choose 18, then your opponent must choose either 19 or 20, in which case you can choose 21. If you choose 15, then your opponent must choose either 16 or 17, in which case you can choose 18, and then 21 after their next choice. And so on (15, 12, 9, 6, and 3). It turns out that there is a clear second-mover advantage in the Game of 21, since the second player can always choose 3, regardless of what the first player chooses, and the first player can never choose a multiple of 3 if the second player does so.
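The backward-induction argument can also be checked mechanically. Here is a small sketch that labels a count as 'winning' if the player who has just reached it wins under best play - it confirms that the counts to sit on are exactly the multiples of 3.

```python
# Backward induction for the Game of 21: a count is 'winning' for the player
# who has just reached it if neither of the opponent's moves (+1 or +2) leads
# to a winning count for the opponent.
from functools import lru_cache

TARGET = 21

@lru_cache(maxsize=None)
def is_winning(count):
    """True if the player who has just brought the total to `count` wins
    with best play from both sides."""
    if count == TARGET:
        return True
    return not any(is_winning(count + step)
                   for step in (1, 2) if count + step <= TARGET)

winning_counts = [c for c in range(1, TARGET + 1) if is_winning(c)]
print(winning_counts)   # [3, 6, 9, 12, 15, 18, 21]
```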

The winning strategy seems obvious when it is explained to you, but it is far from obvious to most people before the game begins. How long does it take for people to figure it out, and would a shorter game (say, a "Game of 6") help? Those are the research questions that this 2010 article by Martin Dufwenberg (University of Arizona), Ramya Sundaram (George Washington University), and David Butler (University of Western Australia), published in the Journal of Economic Behavior and Organization (ungated version here), tackles. Using a sample of 72 research participants, they had 42 of them pair up (in a round-robin format) for five rounds of the Game of 21 (G21) followed by five rounds of the Game of 6 (G6), while the other 30 did the reverse (five rounds of G6, then five rounds of G21). Essentially, they test whether playing the simpler G6, where recognising the multiple-of-3 winning strategy is much easier, helps players to recognise the winning strategy for G21. Indeed, that's what they find. Looking only at players who play second (the 'Green' player in their wording), since only those players have a dominant strategy, they examine the proportion of players playing a 'perfect' game.

First, Dufwenberg et al. note that:

...most subjects playing five rounds of G6 realize that G6 may be solvable by rational calculation...

...most subjects playing the Green position in G21 for the first time do not immediately figure out that choosing multiples-of-three is the best they can do... Across treatments, in G21, only 49 of 179 games (27%) are played perfectly... The rates of perfect play are especially low in the early rounds of the G21-then-G6 treatment (e.g. 2 out of 20, or 10%, in round 1).

Then, turning to their main research question, Dufwenberg et al. find that:

Green players play G21 perfectly in the G6-then-G21 treatment 37% of the time, compared to 21 percent in the G21-then-G6 treatment. This difference is significant at the 5% level (Z statistic = 2.20).

They also show that, among players who appear to have figured out the winning strategy in G21, players who played G6 before G21 choose the winning strategy earlier, on average, than those who played G21 before G6. That suggests that we can learn how to optimise in difficult strategic situations if we are first presented with similar but simpler situations.

An interesting side-point of the Dufwenberg et al. paper was the reference to level-k reasoning. Level-k reasoning refers to the number of steps of reasoning a decision-maker is capable of undertaking. As they note:

...level-0 players may choose randomly across all strategies. Level-1 players assume everyone else is level-0, and best respond; level-2 players assume everyone else is a level-1 player, and best respond; etc...

G6 and G21 don't really require much in the way of steps of reasoning, because once you realise what the winning strategy is, it doesn't matter much what the other player does (unless they don't know the winning strategy).

However, one game that does test level-k reasoning is the 'beauty contest'. Each player must choose a number between 0 and 100, and the winner is the player who chooses the number that is closest to two-thirds (or some other fraction) of the average of all guesses. I played this game many times with students when I was teaching a third-year Managerial Economics and Strategy paper. Level-0 reasoning would lead to a player choosing randomly. If all players did that, then the rational choice for a Level-1 reasoning player would be to choose 33 (two-thirds of the average of 50). However, if you believed that everyone else was a Level-1 reasoning player, making you a Level-2 reasoning player, then you should choose 22 (two-thirds of 33). And, if you believed that everyone else was a Level-2 reasoning player, making you a Level-3 reasoning player, then you should choose 14 (two-thirds of 22). And so on. The Nash equilibrium here is for everyone to choose 1 (or 0, depending on how the game is scored). However, the outcome is never that the winning score is 0. From memory, the winning score in my class was always around 10-20.
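The sequence of level-k guesses is easy to generate - each level just takes two-thirds of the guess of the level below, starting from an assumed level-0 average of 50:

```python
# Level-k guesses in the two-thirds beauty contest, assuming level-0 players
# guess 50 on average and each higher level best-responds to the level below.
def level_k_guess(k, anchor=50.0, fraction=2/3):
    guess = anchor
    for _ in range(k):
        guess *= fraction
    return guess

for k in range(5):
    print(k, round(level_k_guess(k), 1))
# prints 50.0, 33.3, 22.2, 14.8, 9.9 - converging towards the Nash equilibrium of zero
```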

How many steps of reasoning do people engage in? That question has drawn a lot of research attention. One interesting aspect is how early in life we develop level-k reasoning. That's the topic of this new article by Isabelle Brocas and Juan Carrillo (both University of Southern California), published in the Journal of Political Economy (ungated earlier version here). Brocas and Carrillo created a very simple three-player game that could be easily solved by backward induction (for those who understand some game theory). As they describe it:

...subjects were matched in groups of three and assigned a role as player 1, player 2, or player 3, from now on referred to as role 1, role 2, and role 3. Each player in the group had three objects, and each object had three attributes: a shape (square, triangle, or circle), a color (red, blue, or yellow), and a letter (A, B, or C). Players had to simultaneously select one object. Role 1 would obtain points if the object he chose matched a given attribute of the object chosen by role 2. Similarly, role 2 would obtain points if the object he chose matched a given attribute of the object chosen by role 3. Finally, role 3 would obtain points if the object he chose matched a given attribute of an extra object.

The accompanying Figure 1 in the paper helps to understand the game (although the figure is in black-and-white, and the description refers to colours, which doesn't help as much as it could!):

Player 3 is asked to match the shape, so they should choose the dark square C. Player 2 is asked to match the colour that Player 3 will choose, so they should choose the dark triangle B. Notice that Player 2 needs to undertake two steps of reasoning, working out what Player 3 is doing in order to work out what they should do. Player 1 is asked to match the letter that Player 2 will choose, so they should choose the light circle B. Notice that Player 1 needs to undertake three steps of reasoning, because they must work out what Player 3 will do and then what Player 2 will do, in order to work out what they should do.

Brocas and Carrillo run their experiment with a number of samples of children and young adults, and each research participant played the game 18 times (six times in each of the three positions). They expect to find:

...four types of individuals: R (subjects who always play randomly), D0 (subjects who play at equilibrium only if they have a dominant strategy), D1 (subjects who play at equilibrium when they have a dominant strategy and can best respond to a D0 type), and D2 (subjects who can play as D0 and D1, as well as best respond to D1).

And in terms of behaviour:

The predicted behavior is simple. R plays the equilibrium strategy one-third of the time in all roles, D0 always plays the equilibrium strategy in role 3 and one-third of the time in roles 1 and 2, D1 always plays the equilibrium strategy in roles 2 and 3 and one-third of the time in role 1, and D2 always plays the equilibrium strategy.

Their first study involves students from third to eleventh grade from a private school in Los Angeles, along with undergraduate students from USC. Classifying the research participants into the four types outlined above, Brocas and Carrillo find that:

Subjects either recognize only a dominant strategy or always play at equilibrium. Also, some very young players display an innate ability to play always at equilibrium while some young adults are unable to perform two steps of dominance.

In other words, there are no D1 players, as every player who can reason beyond one step can reason all the way through the steps. Then, looking only at the 234 grade school students in their sample, Brocas and Carrillo find that:

Performance in roles 1 and 2 increases significantly up to a certain age (around 12 years old), and then stabilizes...

So, older students perform better, but only up to the age of 12 years. They then go on to replicate similar findings for a Los Angeles public school (where overall performance was lower) - there is no difference in performance from sixth to eighth grade (12-14 years old).

Finally, Brocas and Carrillo study a sample of students from kindergarten to second grade. They simplify the game so that there are only two players (rather than three), and only two attributes (rather than three). With their sample of 117 children, they find that:

Equilibrium behavior is not significantly different between K and grade 1 in roles 2 and 3, and they are both lower than in grade 2 (p < .02, FDR adjusted).

The evolution of strategic behaviour as people age is interesting. That isn't quite what Brocas and Carrillo are studying, since they don't follow the same children over time, but instead look across cohorts of different ages. However, it's hard to see how or why there would be a cohort effect here, so possibly they are observing an age effect. Interestingly, most of the improvement in strategic reasoning happens between the ages of 8 and 12 (second to sixth grade), and there is little improvement after that. That doesn't quite accord with the USC students performing better than the 11th-graders, so perhaps we need to know a little bit more about the evolution of strategic reasoning among older adolescents. However, in relation to younger children, Brocas and Carrillo note that:

Existing research shows that by 7 years of age children may think ahead and form correct anticipations... Children have also been shown to develop inductive logic between the ages of 8 and 12...

Those are the sorts of skills that are used in developing level-k reasoning, so the mechanisms underlying the increase in strategic reasoning between ages 8 and 12 seem plausible. However, this clearly needs to be unpacked a bit more, and that would be a fruitful avenue for future research.

Overall, these two studies help us to understand a little bit more about how (and when) our reasoning in strategic games develops.

[HT: My colleague Steven Tucker for the Dufwenberg et al. study; Marginal Revolution for the Brocas and Carrillo study]

Friday, 26 November 2021

Simulation evidence that alcohol minimum pricing is better than increasing excise tax

If alcohol is too cheap (see this post), then the two main policy options that the government has are to increase alcohol excise tax (which would increase the price of all alcoholic drinks), or to introduce a minimum unit price (which would increase the price of cheap alcoholic drinks, but probably leave more expensive options unchanged in price). Which is better?

On that topic, I just read this 2010 article (open access) by Robin Purshouse (University of Sheffield) and colleagues, published in the prestigious medical journal The Lancet. They constructed a complex simulation model from cross-sectional consumption survey and alcohol purchase data (differentiating between on-premise and off-premise purchases, and type of beverage), as well as health data, for England. Importantly, they disaggregate the effects of changes in price on groups based on the level of drinking: moderate (including non-drinkers); hazardous; and harmful. This seems to me to be one of the most thorough exercises of this type that I have seen. The most obvious flaw is the use of cross-sectional data, where longitudinal data would provide better estimates of the own-price and cross-price elasticities of the various beverage types.

They investigate a wide range of pricing policies, with different levels of change in price. Their model allows them to estimate the effects on alcohol consumption (based on own-price and cross-price elasticities), and the effects on health care costs (based on health economic models) and health gains measured in Quality-Adjusted Life Years (QALYs; based on econometric models linking consumption to alcohol-attributable medical conditions). Their findings are most easily summarised in Figure 1 from the article:

Unsurprisingly, within any type of policy, larger increases in price have more positive effects. However, the more interesting result is comparing across different policies. Purshouse et al. find that:

...notable between-policy differences exist. For example, a £0·45 minimum price would be more effective overall than a 10% general price increase, but is achieved with a much lessened effect on moderate drinkers’ spend and larger increases in spend for harmful drinkers. This differential effect arose because minimum price policies target cheap alcohol products, which make up a higher proportion of the average selection of alcohol purchases for heavier drinkers than for moderate drinkers.

So, policies that have the same overall effect on alcohol consumption can have very different effects in terms of reducing alcohol-related harm. My takeaway from the results overall is that it appears that minimum unit prices work better than increasing prices across-the-board through excise tax increases. This would accord with other research, although it is not a reason to discard excise taxes entirely.
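The mechanical reason for that difference is easy to illustrate. Here is a stylised sketch with a hypothetical set of products (prices per unit of alcohol) and a single assumed own-price elasticity - nothing like the full Sheffield model, which uses beverage-specific own- and cross-price elasticities for each drinker group, but it shows why a minimum unit price concentrates its effect on the cheapest drinks while an excise-driven price rise hits everything:

```python
# A stylised comparison of a 10% across-the-board price rise and a £0.45
# minimum unit price. Product prices and the elasticity are assumptions.
products = {"cheap cask wine": 0.35, "budget beer": 0.55,
            "premium beer": 1.10, "spirits": 1.50}   # hypothetical £ per unit
elasticity = -0.5                                     # assumed own-price elasticity

def consumption_change(old_p, new_p, elasticity):
    """Approximate % change in consumption for a given % change in price."""
    return elasticity * (new_p - old_p) / old_p * 100

for name, p in products.items():
    tax_price = p * 1.10                  # 10% across-the-board price rise
    mup_price = max(p, 0.45)              # £0.45 minimum unit price
    print(f"{name:16s} tax: {consumption_change(p, tax_price, elasticity):5.1f}%  "
          f"MUP: {consumption_change(p, mup_price, elasticity):5.1f}%")
```

The tax reduces consumption of every product by the same proportion, while the minimum unit price only touches the products priced below the floor - which, as the quote above notes, make up a larger share of heavier drinkers' purchases.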

Understanding the effects of potential policy options is important. In Purshouse et al.'s discussion of their results, they make what seems to me to be a really important point:

For policy makers, a balance between reduction in health harms and increased consumer spending might be important for proportionality, and one implication of our study is that minimum pricing strategies might help achieve this balance. For example, a general 10% price rise is estimated to reduce consumption by 4·4% and alcohol-related harm by £3·5 billion over 10 years, but a minimum price of £0·45 could produce a similar overall consumption effect, while achieving greater reductions in harm and a rebalancing of spending effect away from moderate drinkers towards heavier drinkers.

So often, public health researchers ignore the trade-offs inherent in their policy recommendations, or lack any sense of the proportionality of those recommendations. The sort of simulation exercise that Purshouse et al. conducted allows for quite a deep exploration of various pricing policy options. They make the point that their modelling approach can be used as a template for other countries. It would be great to pull together something like this for New Zealand, which might provide the evidence to support minimum unit pricing here.


Wednesday, 24 November 2021

Coronavirus lockdowns and educational inequality in German high schools

The rapid shift to online learning affected schools (and teachers, and students) at all levels. Some schools (and teachers) were better prepared than others, having resources that were more easily adapted to online teaching modes. Some students were better prepared than others, having access to devices and stable internet connections, in order to more fully participate in online learning. The unfortunate thing is that the students who had the lowest access to online learning are likely to be those who were already under-achieving. At least, that is the headline result from this new article by Elisabeth Grewenig (Leibniz Center for European Economic Research) and co-authors, published in the journal European Economic Review (ungated earlier version here).

Grewenig et al. use data from 1,099 parents of school-aged (i.e. not university) students in Germany, collected as part of the ifo Education Survey. The survey collected data on students' time use, both during June 2020 (when lockdowns were in effect and there was basically no in-person teaching), and retrospectively for the period before the coronavirus pandemic. Time use was separated into several categories: (1) school-related activities (school attendance; or learning for school); (2) activities 'deemed conducive to child development' (reading or being read to; playing music and creative work; or physical exercise); (3) 'activities deemed generally detrimental' (watching television; gaming; social media; or online media); and (4) relaxing.

Comparing students' time use during the lockdowns with their time use before the pandemic, Grewenig et al. find that:

...the school closures had a large negative impact on learning time, particularly for low-achieving students. Overall, students’ learning time more than halved from 7.4 h per day before the closures to 3.6 h during the closures. While learning time did not differ between low- and high-achieving students before the closures, high-achievers spent a significant 0.5 h per day more on school-related activities during the school closures than low-achievers. Most of the gap cannot be accounted for by observables such as socioeconomic background or family situation, suggesting that it is genuinely linked to the achievement dimension. Time spent on conducive activities increased only mildly from 2.9 h before to 3.2 h during the school closures. Instead, detrimental activities increased from 4.0 to 5.2 h. This increase is more pronounced among low-achievers (+1.7 h) than high-achievers (+1.0 h). Taken together, our results imply that the COVID-19 pandemic fostered educational inequality along the achievement dimension.

So, low-achieving students (defined as those in the bottom half of the grade distribution for this sample for German and mathematics combined) reduced their study time by more than high-achieving students. To the extent that study time leads to greater academic achievement, this can only lead to an increase in the disparity in academic performance between students at the top and those at the bottom. The really disheartening finding though was that:

...only 29% of students on average had online lessons for the whole class (e.g., by video call) more than once a week. Only 17% of students had individual contact with their teacher more than once a week... The main teaching mode during the school closures was to provide students with exercise sheets for independent processing (87%)... although only 37% received feedback on the completed exercises more than once a week...

The distance-teaching measures over-proportionally reached high-achieving students. Low-achievers were 13 percentage points less likely than high-achievers to be taught in online lessons and 10 percentage points less likely to have individual contact with their teachers... Low-achievers were also less likely to be provided with educational videos or software and to receive feedback on their completed tasks.

If you thought that teachers, having scarce online teaching time available, would prioritise the low-achieving students, perhaps because the high-achieving students are more self-motivated and/or have better learning support through their parents, you would be sorely mistaken. That strongly suggests to me a failure in the way that German teachers were supported in their rapid shift to online teaching activities, since it was entirely foreseeable that low-achieving students would be more greatly affected by the changes. Alternatively, supporting the low-achieving students in low socioeconomic families to have better access to online resources would no doubt have helped as well (although New Zealand's experience suggests that something more proactive than simply having support or resources available for those who ask for it is required).

One issue with this research is the use of retrospective recall about students' time use from the period before coronavirus. Grewenig et al. argue that the degree of social desirability bias is low, and that the results are similar to those from the German Socioeconomic Panel (GSEP), where students report their own time use. However, comparing those two sources, it is clear that the reported number of hours of school-related activities before coronavirus is much higher in this sample than in the GSEP. That needn't be a problem, unless the disparity differs between parents of high-achieving students and parents of low-achieving students. Presumably, all parents are roughly equally able to observe their children's time use during lockdown. That is probably less true for the period before coronavirus. If parents of low-achieving students are more likely to overestimate the number of hours of school-related activities than parents of high-achieving students, then that would bias the results towards showing a bigger decline in school-related activities for low-achieving students. Since those students are low-achieving, it is entirely plausible that they usually spend less time on school-related activities than their parents think they do. Unfortunately, there is no way to easily identify whether that is a problem in this sample.

With that caveat in mind, this study does point to an issue that we should be concerned about, which is how the pandemic has affected student learning, and in particular whether it has increased educational inequality. Hopefully, this is not a general result that extends beyond the German schooling system, but unfortunately it seems likely that it is.

Tuesday, 23 November 2021

The lockdown 'baby boom' in proper context is anything but a boom, and possibly not even related to the lockdowns

I was interested to read this New Zealand Herald article this week:

We've all heard the jokes about how lockdown leads to a "baby boom" - but it turns out being stuck at home does lead to a rise in birth rates.

New information from Stats NZ for the year ending in September 2021 confirms an increase in live births compared to the same time last year.

The data reveals there were 59,382 live births registered in Aotearoa, an increase from 57,753 last year.

And the fertility rate has risen slightly as well, sitting at 1.66 births per woman, up from 1.63 at the same time in 2020...

Significantly, the number of live births as at September 2021 is the highest since 2015 - long before the pandemic changed all of our lives and lockdown was the last thing on anyone's mind.

This was a little bit of a surprise, as the recent births data have shown historically low birth rates in New Zealand, so a 'baby boom' would be unexpected. However, when we actually look at the data, we find that calling it a 'boom' is a mischaracterisation. Here's the data on the raw number of births by quarter in New Zealand, from 1991 to 2021 [*]:

The number of births per quarter fluctuates between about 13,500 and 16,500. There was a bit of a downward trend from 1991 to 2003, then an uptick, before the downward trend resumed from about 2009. You can see the recent rise in births at the end of the series. Indeed, the number of births is at its highest level since 2015. You might even convince yourself that this constitutes a 'baby boom'. However, then you'd also need to believe there was a boom from 2007 to 2011, where the number of births per quarter was mostly at or above the number in Q3 of 2021.

There is a problem with looking at the raw number of births though, and that is that it doesn't account for the size of the population. Population has grown a lot over the 30-year timespan shown in the graph above. To account for that, I calculated the number of births per 100 women aged 15-49 years (you can call this the period fertility rate; I use the rate per 100 women, because that makes the numbers a bit easier to interpret). [**] Here's the result for New Zealand as a whole, since 1996:

The trends are somewhat similar to the previous graph, although the overall downward trend is much more obvious. The recent increase in the birth rate is still apparent, but by itself there isn't much to suggest a 'baby boom', maybe just a slight reversal of the recent trend. As you can see, the rate is lower than it was in 2017, and for basically the entire period prior to 2013. It remains to be seen whether the increase in birth rate in Q3 of 2021 is a brief spasm in the data (similar to Q2 of 2015), or the start of a change in fertility trends. My intuition is that it is the former.

So, was this increase in births caused by lockdown? It is easy to speculate that it is, given the timing. However, we can do a little better than that. Auckland has suffered from longer periods of lockdown than the rest of the country. So, if there is a baby boom driven by lockdowns, it's likely that it would be more apparent for Auckland than for the rest of the country. That isn't what we see though. Here's the birth rates for Auckland over the period since 1996:

That doesn't look much different to New Zealand as a whole (which isn't a surprise - more than a third of the New Zealand total is contributed by Auckland). Also, if we compare the change in the birth rate across regions between Q3 of 2019 and Q3 of 2021 (I chose 2019 for the comparison, since it is the most recent year with no effect of coronavirus or lockdowns), we see this:

The biggest increase in the birth rate between 2019 and 2021 has been in the Tasman and Gisborne regions. Auckland barely features at all. The birth rate in Q3 of 2021 is actually lower in Wellington and the West Coast than it was in Q3 of 2019. It's hard to make a case that lockdowns are a cause for the increase in births, unless you can somehow make the case that the lockdowns had a bigger effect in Tasman and Gisborne, and a smaller effect in Wellington and the West Coast (or, you can show that there are some other socio-demographic or economic effects that can explain the cross-region differences and that more than offset any impact of lockdowns). Now, you could argue that the biggest difference in the effects of lockdown between Auckland and the rest of the country is actually happening now, and so the regional differences in the effect of lockdowns should become apparent in the births data for Q1 of 2022. I guess we will wait and see for that.

So, while there has certainly been an increase in births, it is hardly a 'baby boom' (unless you have an extraordinarily liberal interpretation of what constitutes a boom). And, it is hard to make a case that it was caused by lockdown (unless lockdown and other socio-demographic or economic changes affected birth rates in different regions in some idiosyncratic way, such that Auckland ended up having a very low increase in the birth rate).

*****

[*] The data come from Statistics New Zealand, Infoshare.

[**] The calculations here are based on the births data from Infoshare, plus subnational population estimates for each region, from NZ.Stat. As the data are only provided for 30 June of each year (and only since 1996), I take the 30 June population as the denominator for the rates for Q3 of each year, and use a linear interpolation between each Q3 value to obtain population estimates for the other quarters.
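For anyone wanting to replicate the calculation, here is a minimal sketch of the interpolation and rate calculation described above, with made-up population and births numbers:

```python
# A minimal sketch of the rate calculation in [**], with hypothetical numbers:
# 30 June populations of women aged 15-49 are used as the Q3 denominators, and
# interpolated linearly to the other quarters; births per 100 women follow.
import numpy as np

june_pop = {2019: 1_150_000, 2020: 1_160_000, 2021: 1_165_000}   # hypothetical
years = sorted(june_pop)

quarters, pops = [], []
for i, y in enumerate(years[:-1]):
    start, end = june_pop[y], june_pop[years[i + 1]]
    interpolated = np.linspace(start, end, 5)        # Q3 of y ... Q3 of y+1
    quarters.extend([(y, 3), (y, 4), (y + 1, 1), (y + 1, 2)])
    pops.extend(interpolated[:4])
quarters.append((years[-1], 3))
pops.append(june_pop[years[-1]])

births = {(2021, 3): 15_300}                         # hypothetical quarterly births
for (y, q), pop in zip(quarters, pops):
    if (y, q) in births:
        print(y, q, round(births[(y, q)] / pop * 100, 2), "births per 100 women")
```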

Monday, 22 November 2021

Low-performing students, online teaching, and self-selection

A regular feature of this blog is highlighting some of the research on online teaching, and blended or flipped classroom teaching, and their effects on student learning (see the lengthy list of links at the end of this post for more). One common theme is that there are differences in the effects of online teaching between more-engaged or high-performing students and less-engaged or low-performing students. This 2013 article that I noticed recently, by Fletcher Lu and Manon Lemonde (both University of Ontario Institute of Technology) and published in the journal Advances in Health Sciences Education (may be open access, but just in case there is an ungated version here), also illustrates this effect.

Lu and Lemonde compare 20 students who chose to take an online statistics course with 72 students who chose to take the same course face-to-face. In addition to looking at performance for all students, they split the sample into high-performing students (those with assignment averages above the median) and low-performing students (those with assignment averages below the median).

Overall, Lu and Lemonde find no statistically significant difference in performance between students in the online and face-to-face teaching modes. However (emphasis is theirs):

For those students categorized as higher performing, their test results replicated the results of the many past studies showing no significant difference in their test performance between online versus face-to-face teaching delivery. But the students categorized as lower performing demonstrated test results that were significantly poorer for those enrolled in the online delivery version compared against their lower performing counter-parts in the face-to-face delivery version.

Now, we shouldn't overstate the significance of this particular study. The number of students was small, so it was probably underpowered to identify positive effects on the high-performing students (as have been observed in other studies). However, the bigger problem is self-selection of students into the mode of teaching. My intuition is that lower-performing (or less motivated) students disproportionately select themselves into online teaching modes. They may do this because they think that the online course will be easier than the face-to-face course (sometimes it will be, but not always), or because they mislead themselves into thinking that the added flexibility of online learning will be better for them (which it probably won't be, based on past research).

Self-selection is not just a problem for identifying the 'true' impacts of online teaching. It has real practical implications for teachers, academic departments, and universities. If we offer 'flexible' modes of teaching, where students can select into online or face-to-face teaching, we run the real risk of segregating the least capable students, and those that are least motivated, into a study mode that does real harm to their learning. It is unfortunate that Lu and Lemonde don't really test for selection effects in their sample (they say that they test for differences by comparing assignment averages, and don't find statistically significant differences, but their small sample size might account for that, and it would be much better to test for a difference in some measure of motivation, or some measure of prior achievement).
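The kind of selection check I have in mind is simple enough. Here is a small sketch on simulated data, comparing a hypothetical pre-course measure (prior GPA) between the self-selected online and face-to-face groups - although, as the group sizes (20 and 72) suggest, only a fairly large difference would be detectable:

```python
# A small sketch of a selection check: compare a hypothetical prior-achievement
# measure between the students who chose each teaching mode. Data are simulated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
prior_gpa_online = rng.normal(2.6, 0.6, size=20)     # hypothetical online group
prior_gpa_f2f = rng.normal(2.9, 0.6, size=72)        # hypothetical face-to-face group

t_stat, p_value = stats.ttest_ind(prior_gpa_online, prior_gpa_f2f, equal_var=False)
print(f"difference in means = {prior_gpa_online.mean() - prior_gpa_f2f.mean():.2f}, "
      f"p = {p_value:.3f}")
```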

At this point, I think we really do need more research into which students actually select online teaching options rather than face-to-face. I hypothesise that, as I noted above, there is a core of less-motivated students who select online teaching. I suspect that there may also be some high-performing students who prefer the flexibility that online learning allows them (and the research highlighted in this post seems to suggest that might be the case). That will help in forming policies for flexible learning options that better suit all students.


Saturday, 20 November 2021

Long-run inequality in the US, and the tale of two Ginis

Back in August, I posted about inequality over the long run in New Zealand, back to the 1930s, based on research by John Creedy, Norman Gemmell and Loc Nguyen. The trends in their data were interesting:

Inequality was relatively high (perhaps similar to inequality today) in the 1930s and up to the early 1950s, then fell from the 1950s to the early 1980s. Inequality then rapidly increased back to its prior levels during the reforms of the late 1980s and early 1990s, and then has been relatively flat ever since.

I've previously noted the difference in New Zealand's experience of inequality from that in other countries (and this is something I draw attention to in my ECONS102 class), especially the rapid rise in inequality in the early 1990s, followed by a long period where inequality has barely changed (while inequality has increased in other OECD countries). So, I'm interested to see how the longer-run data for other countries compares as well. With that in mind, I recently read this 2015 working paper by Markus Schneider (University of Denver) and Daniele Tavani (Colorado State University), which presents the long-run trend for the U.S. Here's their Figure 2:

The dark solid line is the overall Gini coefficient. Notice that the trend is quite different from the New Zealand long-run trend in two ways. First, there is a fairly continuous decrease in inequality from the 1920s to the 1940s. In contrast, Creedy et al. showed the decrease in inequality in New Zealand didn't stop until the 1950s (although that might be explained by the data that they were using). Second, there is a continuous increase in inequality from the 1940s to 2012. In contrast, Creedy et al. showed the trend in inequality in New Zealand was flat from the 1950s to the 1980s, followed by a sudden increase in the early 1990s, and then a flat trend again since.

However, aside from the long-run trend, Schneider and Tavani decompose their measure of inequality (the Gini coefficient) into two components, representing: (1) inequality at the top of the income distribution (G1 in the figure above); and (2) inequality at the bottom of the income distribution (G0 in the figure above). Looking at those measures, they find that:

...inequality at the top of the income distribution was relatively stable from the end of WWII until 1981, but has been increasing ever since...

From the end of WWII until the late 70s, increasing inequality as measured by the Gini was driven by inequality at the bottom as the distance between low-and mid-level incomes grew.

So, the long-run increase in inequality in the U.S. is a 'tale of two Ginis'. The increase was driven by inequality at the bottom from the 1940s to the early 1980s, and then driven by increases in inequality at the top from the 1980s to 2012. It would be interesting to see a similar decomposition for New Zealand, and that might explain the lack of increase in inequality in New Zealand in recent times, as there has been little change in the share of income of the top one percent (unlike for many other countries). [*]
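For reference, the overall Gini coefficient itself is straightforward to compute from income data - the sketch below uses the standard textbook formula on made-up incomes. Schneider and Tavani's split into G0 and G1 relies on their own decomposition, which I won't attempt to reproduce here.

```python
# The standard sample Gini coefficient from a vector of incomes (made-up data).
import numpy as np

def gini(incomes):
    x = np.sort(np.asarray(incomes, dtype=float))
    n = len(x)
    # Gini = sum_i (2i - n - 1) * x_i / (n * sum(x)), with i = 1..n on sorted data
    index = np.arange(1, n + 1)
    return np.sum((2 * index - n - 1) * x) / (n * x.sum())

incomes = [12_000, 25_000, 31_000, 47_000, 62_000, 95_000, 180_000]
print(round(gini(incomes), 3))
```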

*****

[*] If you doubt this point, see Bryan Perry's excellent incomes report for the Ministry of Social Development (or this paper by Atkinson and Leigh, although it is based on older data).


Wednesday, 17 November 2021

The persistence of economic misconceptions

Once people have made up their minds about something, they are generally unwilling to change them easily. We can link this to the idea of loss aversion from behavioural economics - we feel greater pain from losing something than pleasure from gaining that same thing, and that applies to opinions just as much as it does to physical objects.

That unwillingness to change their minds leads people to hold a number of economic misconceptions - beliefs that are not only contradicted by economic theory, but also at odds with the beliefs of the vast majority of economists. Two examples of misconceptions are that trade is zero-sum (i.e. that there are not gains from trade for both parties), and that rent controls have generally beneficial (rather than negative) outcomes. [*] We might hope that economists could counter these misconceptions, either by teaching people some economics (but that doesn't appear to work) or by writing books (such as Economic Facts and Fallacies, which I reviewed earlier this week). However, misconceptions appear to be stubbornly persistent.

The futility of trying to address economic misconceptions is neatly on display in this new article by Jordi Brandts (Instituto de Análisis Económico), Isabel Busom (Universitat Autonoma de Barcelona), Cristina Lopez-Mayan (Universitat de Barcelona), and Judith Panadés (Universitat Autonoma de Barcelona), forthcoming in the Journal of Economic Psychology (ungated earlier version here). Busom, Lopez-Mayan and Panadés were all co-authors on earlier research on whether economics teaching could reduce misconceptions (which I discussed here), and this appears to be a follow-up to that earlier work.

In this article, Brandts et al. report on two studies that attempt to reduce misconceptions about the negative effects of rent controls. Both studies make use of a technique called a 'refutation text' (RT), which Brandts et al. explain as:

...a communication tool designed to help people revise their false beliefs through slow, analytical processing of information... Essentially, the RT must first explicitly state the belief and assert it is a misconception. It then should emphasize the negative consequences of the belief and refute it explaining the arguments and evidence obtained through scientific research. In this way the RT intends to connect this new information to the incorrect information pre-existing in a person’s memory. In addition, the RT should acknowledge the motivation for the misconceived belief.

The first study that they report on was a laboratory experiment, where research participants were allocated to one of three conditions: (1) RT; (2) non-refutational text (NRT); or (3) control. Each participant, either individually or as part of a team, answered some questions (which included their views on rent controls), then read the RT (or NRT, or neither, depending on which condition they were assigned to), and then were asked the rent control questions (along with a bunch of other questions) again, both immediately after the experiment, and several weeks later.

The second study was conducted in an economics class, across three cohorts (2015, 2017, and 2019), where:

The first cohort is exposed to a standard lecture on price controls and to a standard practice session where problems about supply, demand and price controls are solved; the second cohort is exposed to the standard lecture and to a practice session with the RT; and the third cohort is exposed to the standard lecture and to a practice session with the NRT.

The students completed questionnaires at the start and the end of their semester, which included questions about their views on rent controls.

Brandts et al. estimate the effect of the RT on misconceptions by comparing the change in views among research participants (or students) in the RT group with the change among those in the control group. They also compare RT with NRT and NRT with control, compare those who completed individual tasks with those who completed group tasks, and make several other comparisons.
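To make that concrete, the quantity of interest is essentially a difference in differences in opinions: the change in each participant's views from before to after, averaged within the RT group and compared with the average change in the control group. Here is a minimal sketch of that comparison in Python, using made-up data and hypothetical variable names rather than the authors' actual data or estimation approach:

import pandas as pd
from scipy import stats

# Hypothetical data: each row is one participant, with their agreement with the
# rent control misconception (say, on a 0-10 scale) before and after the session.
df = pd.DataFrame({
    "group": ["RT", "RT", "RT", "RT", "control", "control", "control", "control"],
    "pre":   [7, 8, 6, 9, 7, 8, 6, 9],
    "post":  [4, 6, 5, 7, 7, 8, 5, 9],
})

# The change in views for each participant
df["change"] = df["post"] - df["pre"]

# Compare the average change in the RT group with the average change in the control group
rt_change = df.loc[df["group"] == "RT", "change"]
control_change = df.loc[df["group"] == "control", "change"]
print(rt_change.mean() - control_change.mean())    # the estimated effect of the RT
print(stats.ttest_ind(rt_change, control_change))  # a simple two-sample test

This is only the bare-bones logic of the comparison; the other comparisons mentioned above (RT vs NRT, NRT vs control, individual vs team) follow the same pattern.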

Brandts et al. present and discuss their results, but their interpretations do not always line up neatly with the results themselves, and the results are not consistent across the two studies. I think they summarise their findings best in the conclusion to the article:

What we learn is, first, that providing scientific information, be it through the RT or the NRT, about a salient issue such as the case of rent controls does change participants’ opinions in the direction of scientific consensus. Second, we learn that the way that the information is presented (RT vs NRT) does not make a difference... Third, a large proportion of participants still sticks to the misconception.

There is some weak evidence in the paper that the RT might have a small effect when more reflective people discuss it in teams rather than individually. The main problem with this study is the small sample size, which means it lacks the statistical power to detect small effects. Study 1 had only 180 participants, and while Study 2 had over 1,200 students, it relies on comparisons between cohorts, and differences in attrition rates across the three cohorts may have contributed to the lack of statistically significant results.
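To give a rough sense of the power problem, here is a back-of-the-envelope calculation, assuming (for simplicity) a two-group comparison with about 60 participants per condition (180 participants split across three groups) and a 'small' standardised effect size of 0.2. The actual design is more complicated, so this is only indicative:

from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power to detect a standardised effect of 0.2 with about 60 participants per group
power = analysis.power(effect_size=0.2, nobs1=60, alpha=0.05, ratio=1.0)
print(round(power, 2))    # well below the conventional 0.8 threshold

# Sample size per group needed to detect the same effect with 80% power
n_required = analysis.solve_power(effect_size=0.2, alpha=0.05, power=0.8, ratio=1.0)
print(round(n_required))  # several hundred per group

A study of that size simply cannot reliably distinguish a small effect of the RT from no effect at all.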

The key takeaway is that this is the sort of research that is begging for additional replication studies, not just in economics but across many fields. Understanding how we can best counter misconceptions is important not just for economists, but also for public health researchers (to take a very salient current example), scientists, and researchers more generally.

*****

[*] Note that this doesn't apply to cases where there is still reasonable debate among economists, such as the disemployment effects of the minimum wage (although, see my most recent post on that topic). It also doesn't apply to misconceptions driven by underlying behavioural biases and heuristics, such as the sunk cost fallacy, or where the economic theory is more difficult to understand.

Read more:

Monday, 15 November 2021

Book review: Economic Facts and Fallacies

Over the last few years, a number of people (including some students) have recommended that I read books by Thomas Sowell. I have a few of his books, but until recently I had never read any of them. The one that I chose to begin with was the 2011 second edition of Economic Facts and Fallacies, which I just finished reading.

To be honest, I found it a bit uneven. Sowell is an excellent writer, and the book is easy to read. However, in this book he clearly had some pet hates that he wanted to air. The book sets out a grand purpose:

The purpose of all this is not simply a debunking, in order to conduct a sort of demolition derby of ideas, but to reveal fallacies that have had harmful effects on the well-being of millions of people in countries around the world. Economic policies based on fallacies can be - and have been - devastating in their impacts.

That charge against economic policies seems fair - for example, consider the policies highlighted in Joseph Stiglitz's book Globalization and Its Discontents (which I reviewed recently here). Sowell begins the book by outlining some specific categories of fallacies, including the 'zero-sum fallacy' (that what is gained by one person must have been lost by someone else), the fallacy of composition (that what is true of a part must be true of the whole), the post hoc fallacy (that because one thing happened after another, the second event must have been caused by the first), the chess-pieces fallacy (that policy can be devised to change people's behaviour in much the same way that a chess player moves pieces on a chess board), and the open-ended fallacy (that there is no scarcity, and no matter how much is done, more could be done).

Those fallacies are well explained (much better than I could do in a single sentence). However, for the most part, the first chapter is the only place where those fallacies are explicitly named. After the first chapter, Sowell launches into what he considers various fallacies, generally without referencing any of them back to the types he outlined in the opening chapter. That left me wondering why there was a need to outline those categories in the first place. To be fair, they do appear in later chapters, especially the fallacy of composition; it is just that Sowell rarely refers back to them explicitly.

Each chapter after the first outlines a collection of fallacies within a topic area, and presents various facts and research that Sowell uses to argue against them. He begins with urban fallacies, and then moves on to gender, academia, income, race, and development. Sowell clearly has a lot of bugbears that he wants to counter. However, he often overstates his case, and in some instances I believe he creates his own fallacious strawman arguments in order to do so. For instance, in relation to the declining middle class:

One of the simplest statistical illusions has been created by defining the middle class by some fixed interval of income - such as between $40,000 and $60,000 - and then counting how many people are in that interval over the years... the number of middle class people declines when there is a fixed definition of "middle class" in a country with rising levels of income.

The second part of the quote is correct, but it only follows if you accept the initial premise that the middle class is defined by a fixed interval of income. I'm unaware of anyone serious who has made that claim, and Sowell doesn't attribute it to anyone (despite providing a number of references throughout the book). He sets up a strawman argument, and then sets it on fire. There are several other examples where he does the same, such as his claim that:

Here are encapsulated the crucial elements in most critiques of "income distribution" to this day... In reality, most income is not distributed, so the fashionable metaphor of "income distribution" is misleading.

People refer to the 'income distribution' because it is a statistical distribution, not because they believe that income is distributed (for example, by the government). Sowell walks back that claim a couple of pages later, but I find it highly ironic that he accuses advocates of redistribution of engaging in "verbal sleight of hand". To me, a significant portion of this book is a master class in verbal sleight of hand.

One further example of this should suffice. In considering audit studies (such as the one I described here), where researchers send out job applications that differ only in the race of the applicant in order to identify the extent of racial discrimination in decisions to invite applicants to a job interview, Sowell writes:

The fallacy in this approach comes from ignoring the high cost of knowledge and the high costs of making wrong decisions. Neither objective job qualifications nor income tell the whole story for anyone of any race. Other sorting devices may be resorted to where acquiring more specific information is costly, such as seeking more detailed information from previous employers - which many former employers are reluctant to provide, given the legal risks they face when providing adverse information - or hiring private detectives to look into the private lives of job applicants, housing applicants, or applicants for loans.

If the samples in audit studies are matched across all criteria except race, then none of these other considerations matter. Sowell sells an eloquent story here, but he is really just engaging in obfuscation.

The book is not all bad, though, just uneven. On the positive side, I found myself agreeing wholeheartedly when Sowell writes:

Concerns over poverty is often confused with concern over differences in income, as if the wealth of the wealthy derives from the poverty of the poor.

This is a point that I have made before (e.g. see here or here). There are many other examples of fallacious arguments that Sowell adeptly dismantles, especially those that are used by decision-makers (like bureaucrats, or university administrators) to impose costs on others. Clearly, imposing costs on others where a decision-maker faces no consequences themselves is one of Sowell's pet hates.

Overall, the book is easy to read, and may go some way towards helping readers to recognise that people make fallacious arguments to support their preferred policy prescriptions. However, as I'll note in my next post, it is unlikely that simply exposing those fallacies and arguing against them will actually change anyone's mind.

Sunday, 14 November 2021

Financial education may be effective in raising financial literacy, but its effect on financial behaviours is more complicated

I have previously conducted research on financial literacy among teenagers, and continue to explore economic literacy among university students. As my previous research has shown (see here; with ungated earlier version here), financial literacy is low. It isn't just low among the teenagers we studied; financial literacy is low across society more generally (e.g. see here).

An obvious solution would seem to be to expand financial education. For example, making financial education compulsory in schools might increase financial literacy. So might making adult financial literacy courses more accessible. In my reading of the previous research on financial education, it seemed to me that the evidence on the effectiveness of financial education is weak. However, I have been known to be wrong on occasion, and this might be one such occasion.

The evidence on the impact of financial education on financial literacy and financial behaviour is reviewed in this 2017 meta-analysis article by Tim Kaiser (University of Kiel) and Lukas Menkhoff (German Institute for Economic Research), published in the journal World Bank Economic Review (ungated earlier version here). Kaiser and Menkhoff's meta-analysis is based on 126 impact evaluation studies, and they summarise their findings as (emphasis is theirs):

...(i) increasing financial literacy helps. Financial education has a strong positive impact on financial literacy with an effect size of 0.26 (i.e., above the threshold value of 0.20 that characterizes “small” statistical effect sizes...). Moreover, effects on financial literacy are positively correlated with effects on financial behavior; (ii) financial education has a positive, measurable impact on financial behavior with an effect size of 0.09. An effect size of 0.08 is still found under rigorous randomized experiments (RCTs); (iii) effects of financial education depend on the target group. First, teaching low-income participants (relative to the country mean) and target groups in low- and lower-middle–income economies has less impact, which is an obvious challenge for policymakers targeting the poor. Second, it appears to be challenging to impact financial behavior as country incomes and mean years of schooling increase, probably because high baseline levels of general education and financial literacy cause diminishing marginal returns to additional financial education; (iv) success of financial education depends on the type of financial behavior targeted. We provide evidence that borrowing behavior may be more difficult to impact than saving behavior by conventional financial education; (v) increasing intensity supports the effect of financial education; and (vi) the characteristics of financial education can make a difference. Making financial education mandatory is associated with deflated effect sizes. By contrast, a positive effect is associated with providing financial education at a “teachable moment” (i.e., when teaching is directly linked to decisions of immediate relevance to the target group...).

I think there is a lot of good news that we can take away from that meta-analysis. However, notice that making financial education mandatory is associated with smaller effects, and that targeting financial education at the right groups and at the right times ('teachable moments') is likely to be most effective.

The effect on financial behaviours might be the most questionable finding, though. Kaiser and Menkhoff merge a huge variety of behavioural effects together in their meta-analysis, everything from reducing informal borrowing (e.g. from a moneylender), to having a bank account or insurance, to having a financial plan, to measures of net wealth. So, it's difficult to interpret which behaviours are actually improved by financial education. Were all financial behaviours improved? That seems unlikely. Which behaviours improved, and which did not? Did the characteristics of the target group and the type of financial education matter for which behaviours were affected? These questions are left unanswered. Also unanswered is the important question of what the mechanism for changes in financial behaviours is. We clearly need more studies linking financial education, financial literacy, and financial behaviours, in which the particular behaviours are pinned down.

That brings me to this recent article by Kenneth De Beckker, Kristof De Witte, and Geert Van Campenhout (all KU Leuven), published in the Journal of Economic Behavior and Organization (ungated earlier version here). They ran a randomised controlled trial among Flemish school students (average age 13) from 20 schools. In the trial, they gave treated students access to an online financial literacy course. De Beckker et al. explain:

The course deals with budgetary choices in everyday life. Afterwards, students are expected to be familiar with concepts like interest and inflation, have insight in different saving and investment products, understand the benefits of saving for long-term goals or unanticipated expenses, and grasp the risks of credit. The learning path consists of five modules with multiple exercises, information sheets and a formative test. The exercises contain videos, interactive learning games, and case studies adapted to the living environment of students from the eighth and ninth grade.

De Beckker et al. then measured students' financial literacy, as well as their financial behaviour, comparing students who were given access to the course with control students who were not. Financial behaviour is measured using a discrete choice experiment, where students were presented with various hypothetical choices about the purchase of a new smartphone at various prices, where some of the options required direct cash payments and others involved a payment plan. This research design allows the authors to look at how the financial education course affects the students' preferences in the smartphone purchase. That is, they can look at whether the financial education makes students more price sensitive, or more likely to avoid purchasing on credit.
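To illustrate the flavour of that design, here is a minimal sketch (in Python, on simulated data) of how one might test whether a treatment shifts price sensitivity or the valuation of a payment plan. To be clear, this is not De Beckker et al.'s actual specification - they use a proper discrete choice experiment - and all of the variable names and numbers below are hypothetical:

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 2000  # hypothetical choice occasions

# Simulated data: each row is one hypothetical smartphone offer shown to a student
df = pd.DataFrame({
    "price": rng.uniform(200, 800, n),      # price of the offer
    "payment_plan": rng.integers(0, 2, n),  # 1 = payment plan, 0 = direct cash payment
    "treated": rng.integers(0, 2, n),       # 1 = received the financial education course
})

# Simulate purchase decisions where (by construction) treated students are more
# price sensitive and more averse to payment plans
utility = (2.0 - 0.004 * df["price"] - 0.3 * df["payment_plan"]
           - 0.002 * df["price"] * df["treated"] - 0.3 * df["payment_plan"] * df["treated"])
prob = 1 / (1 + np.exp(-utility))
df["bought"] = rng.binomial(1, prob)

# A simple logit with treatment interactions: the interaction coefficients tell us
# whether the course changes how price and credit are valued
model = smf.logit("bought ~ price * treated + payment_plan * treated", data=df).fit()
print(model.summary())

In the paper, the equivalent question is whether the valuation of attributes like price, credit availability, quality information, and promotions differs between treated and control students - and, as noted below, it does not.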

Turning to their results, De Beckker et al. found that the course was effective in raising financial literacy among the students:

The results provide evidence that the financial education course is effective. Controlling for all observed heterogeneity in terms of school and student characteristics, the financial education course increases students’ financial literacy scores by 0.46 standard deviations on average...

However, in terms of financial behaviour:

Overall, we observe that the treatment does not affect how attributes like price, credit availability, information on the quality of the product and promotions are valued. This suggests that the financial education course increased students’ level of financial literacy but this did not trickle down further: students did not change their buying behavior. 

That is disappointing. A single study of young Flemish teenagers is not enough to overturn the meta-analysis by Kaiser and Menkhoff. However, I think we need to consider further the mechanisms through which financial education, and financial literacy, lead to changes in financial behaviour. Perhaps these students' hypothetical decisions were not affected, but the result might be different for financial education delivered at 'teachable moments'. Exploring the mechanisms further in future research would help policy makers and financial educators to better design financial education programmes (whether mandatory in schools, or delivered at 'teachable moments' like before a student takes out a student loan, or a first home buyer takes out a mortgage), so that financial education does a better job of improving people's long-term financial wellbeing.

Saturday, 13 November 2021

Live streamed video lectures and student achievement

As I have noted many times on this blog (see the links at the bottom of this post), I strongly believe that online learning has heterogeneous effects on student learning and achievement. As I said in my most recent post on this topic:

Students who are motivated and engaged and/or have a high level of 'self-regulation' perform at least as well in online learning as they do in a traditional face-to-face setting, and sometime perform better. Students who lack motivation, are disengaged, and/or have a low level of self-regulation flounder in online learning, and perform much worse.

The problem with a lot of the research that tries to establish the effects of online or blended or hybrid learning on student achievement is that it doesn't distinguish between the effects on students at the top of the ability distribution and the effects on students at the bottom of the ability distribution. So, it doesn't really tell us a lot about how a change in teaching practice will affect the whole distribution of student achievement - it might only tell us what will happen at the middle of the distribution, which often isn't very helpful.

One exception is this recent article by Paula Cacault (EPFL), Christian Hildebrand (University of St. Gallen), Jeremy Laurent-Lucchetti, and Michele Pellizzari (both University of Geneva), published in the Journal of the European Economic Association (open access, but just in case there is an ungated earlier version here). Cacault et al. investigated the effect of live streamed video lectures on student attendance behaviour and achievement across eight compulsory management, statistics, and economics courses at the University of Geneva. Students in each class (nearly 1500 students in total) were given access to a live stream of lectures in some weeks, but not others. As they explain:

Based on the enrolment lists of each course from the e-learning platform, we first randomly assigned students to three groups. A first group of students (15% of all students) never had access to the streaming service and we label this group the Never-access. Another 15% of the students were given access to the service in all the weeks of the term and we label this group the Always-access. The remaining 70% of students were given access to streaming only some weeks at random and we label this group the Sometimes-access. Every week, a varying share of students in this group was given access. In the Spring semester 2017, we randomly assigned weekly access to 50% of the sometimes-access group. In the Fall semester 2017, we decided to vary this share between 20% and 80%...

Physical attendance in the classroom was always possible and students could freely decide to go to class in person even in the weeks when they had access to streaming...

Cacault et al. then compare student performance in the final examinations on questions drawn from lectures where students had live streaming access with performance on questions drawn from lectures where they didn't. First, they find that take-up of live streaming is low:

Using information from the server of the streaming platform, we can identify the students who actually accessed the service. On average, only about 5% of the students (i.e., including those who had no access) used the service at least once in each week...

Combining information on assignment and usage we also construct measures of take-up, that is, the share of students with access who logged into the platform. This share is on average around 10%–11%, ranging across weeks from a minimum of 6.7% to a maximum of 13%.

So, very few students made use of the live streaming. Cacault et al. find that there are no differences in take-up between students at different 'ability levels' (defined by their academic performance in high school). They also find only a small effect on classroom attendance (which was measured from photos taken of the classroom):

...for every 100 students who are offered lectures via live streaming about 8 of them do not show up in class.

So, the effect on learning should be pretty small, given that few students make use of the live streaming option. Indeed, Cacault et al. find that:

Results indicate that on average there is no detectable effect of the experimental assignment, nor of actual usage of the streaming platform. However, once we look at effects by ability groups, we uncover large heterogeneity, with a sizeable negative ITT [intent-to-treat] effect on the low-ability students and a positive effect on the high-ability students...

The magnitudes of these estimates are sizeable. For students in the bottom 20% of the ability distribution, having access to the streaming platform (regardless of whether one uses it or not) lowers the share of correct answers by approximately 2 percentage points over an average of about 55%. The positive effect at the top of the ability distribution is even larger and in the order of about 2.5 percentage points. The ATT [average effect of treatment on the treated] estimates are very large: around -18 percentage points for the low-ability and about +25 percentage points for the high-ability students.

The ATT results are the effects on the students who actually used the live streaming, rather than all of those who were merely given the option to do so. Because take-up was so low, the ATT estimates are (roughly) the ITT effects divided by the take-up rate, which explains why they are so much larger.
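A quick back-of-the-envelope check, using the approximate numbers reported above (this is just arithmetic, not the authors' estimation), shows that this scaling lines up with the reported estimates:

take_up = 0.105          # roughly 10-11% of those offered access actually used it
itt_low_ability = -2.0   # ITT effect in percentage points, bottom 20% of the ability distribution
itt_high_ability = 2.5   # ITT effect in percentage points, top of the ability distribution

print(itt_low_ability / take_up)   # about -19 pp, close to the reported -18 pp
print(itt_high_ability / take_up)  # about +24 pp, close to the reported +25 pp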

Now, these results relate to live streaming, so we should be cautious about over-interpreting them as applying to all online learning. However, overall these results accord with those I have discussed on the blog earlier - online learning options tend to make more motivated, higher ability students better off, and less motivated, lower ability students worse off. Cacault et al. suggest a possible mechanism that explains this difference in effects:

Consider a situation in which the streaming technology is not available. Assume that in normal times most students attend lectures in person, but when the cost of going to class is too high, the good students tend to stay at home and study on their own. This happens because very good students can read the material in the book and understand most of it easily, even without the professors presenting and explaining it, whereas students of lower ability would have a harder time learning in autonomy and prefer attending.

Introducing the streaming technology in this context would allow all students to use it when the cost of going to class happens to be high. The good students replace own study with streaming, which improves learning and leads to the observed positive effect on exam performance. The low-ability students watch the streamed lectures instead of going to class, which is a more effective (but also more costly) mode of learning, and eventually perform worse. The students in the middle range of the ability distribution, substitute streaming to attendance for small shocks and streaming to no-attendance for larger shocks. Hence, average effect on grades tend to be close to zero, as we see in our data...

It's possible that this is the mechanism at play in their context, where live streaming is available but lectures are not recorded, but I'm unsure how well it translates to other contexts, particularly where lectures are recorded. I do think we need more research in this area, but in particular, we need research on how to mitigate the negative impacts of online learning on students at the bottom of the ability and motivation distribution. If universities are serious about an enduring shift to online learning, this is a problem that needs urgent attention.

Read more:
