Friday, 27 February 2026

This week in research #115

Here's what caught my eye in research over the past week (another slow week):

  • Mortágua gets deep into the theoretical weeds on the question of whether crypto-assets are money
  • Carpenter et al. (with ungated earlier version here) use 2021 Canadian Census data to look at earnings disparities experienced by nonbinary people, and find that nonbinary individuals assigned male at birth, transgender men, transgender women, and cisgender women all earn significantly less than comparable cisgender men

Also new from the Waikato working papers series:

  • Valera, Lubangco, and Holmes propose a new measure of revisions to consumer inflation expectations that uses repeated cross-sections rather than panel data, and show that individual inflation expectations are sensitive to price changes across 14 food and energy goods

Thursday, 26 February 2026

Tuition fees, incentives, and 'ghost students'

When the New Zealand government introduced 'first-year fees free' in 2018, the universities expected a big uptick in student numbers. It didn't happen (as I discussed in this 2023 post). As the figure below (source) shows, the mild downward trend in domestic student numbers (equivalent full-time students, or EFTS) continued for at least a couple of years past 2018:

My colleagues were worried that we would see an increase in the number of students who enrol, and then do nothing at all (what we call 'ghost students'). My impression was that this didn't happen, but until now I had never intentionally looked at the numbers. So, the figure below shows the proportion of each of my A Trimester ECON100 classes (up to 2017) or ECONS101 classes (for 2018 onwards) that were ghost students (I didn't teach the class in 2022, which is why there is no observation for that year). Here, I define a 'ghost student' as any student who didn't attempt any of the tests or exams (although they may have attended some classes during the trimester). In each trimester, the class had between 250 and 350 enrolments in total. [*]

As the figure shows, there was a big jump in 'ghost students' in 2021, but that is attributable to the COVID pandemic and the weirdness of that whole time period, rather than anything to do with fees-free. In most years, somewhere between three and five percent of students are 'ghosts'. In 2025, the government shifted from first-year fees free to final-year fees free. There's no evidence that the change affected the proportion of 'ghost students' either, although it may be too early to tell - the proportion in 2025 was lower than in either of the previous two years.

Why might we expect the changes in fees to affect the number of 'ghost students'? It comes down to incentives. As my ECONS101 students will hear next week, when the cost of something decreases, we tend to do more of it. First-year fees free decreased the cost of being a 'ghost student', so ceteris paribus (holding all else constant), we would expect to see more 'ghost students'. Final-year fees free (with first-year fees reintroduced) increased the cost of being a 'ghost student', so ceteris paribus, we would expect to see fewer 'ghost students'. The fact that this didn't happen is interesting, and we'll come back to that a bit later.

To see why the New Zealand effect might be negligible, it helps to compare with a setting where student status comes with larger immediate benefits. To do that, I want to discuss this recent article by Johannes Berens (RH Köln), Leandro Henao, and Kerstin Schneider (both University of Wuppertal), published in the journal Labour Economics (ungated earlier version here). They look at the impact of the removal of tuition fees in North Rhine-Westphalia in Germany in 2011. Tuition fees were a very modest EUR500 per year (for every year of study), and Berens et al. essentially compare students who were more or less affected by the policy (depending on how many years they didn't have to pay fees for), looking at a range of academic outcomes including exam registrations and withdrawals, credit points earned, grades, and dropout probabilities, as well as the number of 'ghost students'.

Their data come from a single university, with over 11,000 students who first enrolled between 2008 and 2011. The students in the 2008 cohort would have graduated before the fees were removed, while those in the 2011 cohort would not have faced any fees at all. The other cohorts would have had fees in their later year/s, but not earlier year/s. Applying a difference-in-differences approach, Berens et al. find that:

...abolishing tuition fees significantly affected student behavior and academic outcomes. Active students reduced their academic performance by 1.7 credit points per semester (12 % relative to baseline), despite maintaining similar exam registration patterns... Additionally, the reform increased the prevalence of ghost students by 10 percentage points...

So, removing fees in this context substantially increased the proportion of 'ghost students' by 10 percentage points, from a baseline that was already over 10 percent (Berens et al. present the data by study semester, and the 'ghost student' proportion varies between 10 percent and 20-25 percent, depending on year and study semester).
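The difference-in-differences logic here can be sketched in a few lines. The cohort means below are invented (they are not Berens et al.'s data), chosen only to mirror the paper's headline 1.7 credit-point decline:

```python
# Minimal two-group, two-period difference-in-differences sketch.
# 'treated' = cohorts whose later study years became fee-free;
# 'control' = a cohort that paid fees throughout. All numbers invented.

credit_points = {
    ("control", "before"): 14.0,  # mean credit points per semester
    ("control", "after"): 13.5,
    ("treated", "before"): 14.2,
    ("treated", "after"): 12.0,
}

def did_estimate(data):
    """(Change for the treated group) minus (change for the control group)."""
    treated_change = data[("treated", "after")] - data[("treated", "before")]
    control_change = data[("control", "after")] - data[("control", "before")]
    return treated_change - control_change

print(round(did_estimate(credit_points), 1))  # -1.7 with these invented numbers
```

The control group's change (-0.5) nets out whatever was happening to all students over time, so the remaining -1.7 is attributed to the removal of fees.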

What explains the high impact of removing fees in Germany? Berens et al. highlight the role of incentives, and in particular the generous nature of public assistance available to students. Specifically:

...student status confers substantial benefits, generally independent of academic performance... These benefits include subsidized health insurance (until age 25), state-wide public transport access (worth EUR 2900 annually), and parental child allowance (EUR 2450 annually). About 16 % of students also receive need-based grants averaging EUR 6800 annually...

So, being classified as a student can be quite lucrative in Germany, even if the student is a 'ghost'. That might also explain the lack of effect of first-year fees free in New Zealand. While the fees are higher in New Zealand than in Germany, being a student in New Zealand is hardly a pathway to great riches (at least, not during the time spent as a student - see this post, and the links at the end of it). The student allowance is not very generous, and while there are some other perks to being a student, cheap movie tickets and public transport are not exactly worth a lot of money. So, it shouldn't be much surprise that the impact in Germany was much larger than for a similar policy change in New Zealand.

Another reason that the impact was not apparent in New Zealand could be that many students do not pay their tuition fees immediately. Instead, many (perhaps most) students' tuition fees are paid by student loans. 'Student Greg' is probably quite content to say that the student loan is 'Graduated Greg's' problem, and not worry about it today. So, from the perspective of 'Student Greg', first-year fees free doesn't really impact the decision to become a student or not. It doesn't change the costs of being a student for 'Student Greg', because they don't consider paying back the student loan as part of the costs of studying today. [**] And that might explain why there was no incentive effect of first-year fees free in New Zealand (also, fees-free papers are not free if students fail them, as I noted in this 2023 post).

The incentives in Germany and New Zealand, when tuition fees were changed, resulted in quite different impacts. In Germany, where the benefits of being a student were higher, the lower cost of being a 'ghost student' induced many more people to enrol as 'ghosts', whereas in New Zealand, where the benefits of being a student are lower, and the costs of tuition are typically deferred to the future, the lower cost of being a 'ghost student' appears to have made no difference.

The nature of incentives, and the costs and benefits around the decision, definitely matter. The policy takeaway from this is that tinkering with fees alone may induce more (or fewer) 'ghost students', so the other immediate benefits and costs associated with student status also need to be considered.

*****

[*] The data are for only one paper, but ECON100 and ECONS101 have been, for the most part, compulsory papers for business students. In a couple of years, some students could avoid the paper by taking all of the other first-year business papers. However, unless 'ghost student' status was more likely for students who did not take first-year economics, these results should be broadly representative.

[**] Essentially, 'Student Greg' is heavily discounting the future. In my ECONS102 class, we say that 'Student Greg' exhibits present bias, and is therefore only quasi-rational, not purely rational. Of course, not all students will have acted like 'Student Greg', but if enough of them did, that would explain the lack of incentive effects of the changes in first-year fees.

Tuesday, 24 February 2026

Book review: Economics (Ben Mathew)

I just finished reading Ben Mathew's 2013 book, imaginatively titled Economics. The subtitle is more descriptive though: "The remarkable story of how the economy works". The subtitle is also an accurate statement, as how the economy works does make for a remarkable story. Unfortunately, Mathew only provides a narrow (and biased) part of the story.

Don't get me wrong. This book is beautifully written, and will be easy for most non-economists to follow. I really enjoyed large parts of it. It is also quite humorous in parts. Consider this bit, which is both quite true and quite funny:

A Scottish philosopher by the name of Adam Smith figured out the answer and wrote it down in a book called An Inquiry into the Nature and Causes of the Wealth of Nations. The massive tome was published in 1776 and invented modern economics. All economists have a copy on their shelf, and some have even read parts of it.

Guilty as charged: I have a copy of The Wealth of Nations on my bookshelves, and I have even read parts of it (but not the whole book).

What lets this book down is the single-minded market fundamentalist approach. This book is everything that critics of 'neoliberal economics' love to hate. Mathew puts capitalism, the market, and prices at the centre of the 'remarkable story', which is sensible. However, he bats away or ignores critical problems with markets, such as externalities, information asymmetries, and monopoly or market power. Public goods do get a mention, but not until the last five pages of the book. Aside from public goods, the only market failures that are discussed are those caused by government intervention: price controls and taxes.

This uneven treatment isn't going to convince many readers, and those who are already skeptical of markets and economists will have cause to double-down on their skepticism. The market fundamentalist approach is understandable coming from Mathew, who was trained at the University of Chicago, the epicentre of 'price theory'. However, given how wonderfully Mathew writes, I feel like this was a real missed opportunity to have a book that truly describes the remarkable story of how the economy works, not based on a market-centred idealist view, but in all of its messy glory. Perhaps readers should read this book alongside Michael Sandel's What Money Can't Buy (which I reviewed here), and take the average of the two?

Monday, 23 February 2026

Migration won’t ‘solve’ ageing (and it definitely won’t solve it everywhere)

Every so often, someone wheels out the claim that migration is the obvious solution to population ageing. My previous research with Natalie Jackson (ungated version here) showed that it isn't - for New Zealand overall, and for subnational (territorial authority) areas within New Zealand.

However, things are not straightforward at the subnational level. Local labour markets differ, as do the housing markets, educational and other institutions, local amenities, and job opportunities. All of these things will affect the age distribution of migrants, both into and out of a particular place. Some places attract retirees. Other places attract tertiary students, or young families. Some places do a bit of both. Other places just seem to be places that people want to flee.

In a new working paper with Courtenay Baker, we look at what’s happened to New Zealand’s working-age population (15–64) over the last quarter century (from 1998 to 2023), broken down across 66 territorial authorities and 21 Auckland local boards (TALBs), and five-year time periods. The key idea is simple: if the working-age population changes, where did that change come from?

Specifically, we disaggregate changes in the working-age population into three components. The first component is 'cohort turnover', which is the number of people ageing into the working-age population (basically, those aged 15-19 years) minus the number ageing out of the working-age population (basically, those aged 65-69 years). The second component is deaths among the working-age population. The third component is net migration at working ages, which we measure as a residual, because it can't easily be measured directly (it is basically the change in the working-age population, adjusted for deaths and cohort turnover).

Nationally, the working-age population grew in every five-year period we look at (see the table below). But the reason for that growth changed dramatically over time. In 1998-2003, the working-age population (WAP) grew 6.6 percent, and most of that change came from cohort turnover (5.5 percentage points). Migration helped (+2.1 percentage points), and deaths nudged things down a bit (-1.0 percentage points). However, by 2018-2023, the working-age population grew 5.1 percent, and net migration contributed 4.6 percentage points of that change. Cohort turnover only contributed 1.3 percentage points (and deaths contributed -0.8 percentage points).
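As a quick sketch of the decomposition arithmetic, using the national percentage-point contributions quoted above, net migration can be backed out as the residual:

```python
# Decomposing working-age population (WAP) change into cohort turnover,
# deaths, and net migration, where net migration is the residual:
# migration = total change - cohort turnover - deaths.
# Figures are the national percentage-point contributions quoted above.

def migration_residual(total_change, cohort_turnover, deaths):
    """Net migration as the unexplained remainder of WAP change."""
    return total_change - cohort_turnover - deaths

# 1998-2003: 6.6 = 5.5 (turnover) + (-1.0) (deaths) + residual
print(round(migration_residual(6.6, 5.5, -1.0), 1))  # 2.1

# 2018-2023: 5.1 = 1.3 (turnover) + (-0.8) (deaths) + residual
print(round(migration_residual(5.1, 1.3, -0.8), 1))  # 4.6
```

The residual approach works because the three components exhaust the ways a closed age group can change; anything not explained by ageing in/out or deaths must be migration (plus any measurement error, which ends up in the residual too).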

So yes, the working-age population is growing at the national level. And yes, migration is contributing a bigger proportion of that change over time. However, the real story here is the decline in cohort turnover as the population ages, as well as how this is playing out at the subnational level. For many TALBs, negative cohort turnover has become a reality. There were no TALBs with negative cohort turnover in the 1998-2003 period, but there were 30 TALBs that had negative cohort turnover in the 2018-2023 period. In other words, more than one-third of all areas are experiencing more people ageing out of the working-age population than the number of people entering the working-age population at young ages. This overall shift towards more negative (and less positive) cohort turnover is demonstrated in the leftward shift of points between Figure 1 (on the left, showing 1998-2003) and Figure 2 (on the right, showing 2018-2023) from the paper:

A natural response might be to say, "Those places should just try to attract more migrants." And sometimes they do! Across TALBs, cohort turnover and net migration tend to move in opposite directions (there is a moderately strong negative correlation between cohort turnover and net migration). Notice that in the two figures above, there is a downward-sloping trend line in each of them (and in each of the other five-year periods as well).

But 'sometimes' and 'tend to' are not a reliable policy prescription. When we look specifically at places with negative cohort turnover, most places do indeed offset it with positive migration, but not universally, and not consistently. In 2018-2023, two areas only partially offset negative cohort turnover (Kaikōura District and Dunedin City), and two had migration that actually made things worse (Waitematā local board and Chatham Islands Territory).

The takeaway is, again, that migration cannot be relied on to solve population ageing (or cohort turnover, in this case). Our decomposition basically shows that some places are increasingly reliant on migration to keep their working-age population from shrinking, but migration is highly unstable. In particular, migration is sensitive to policy changes at the national level, as well as to business cycle changes (and international migration, in particular, is sensitive to changes in Australia). These are things that local policy makers and planners have little control over.

To be clear, this doesn't mean that TALBs should be fatalistic about changes in the working-age population. But they need to be realistic. Not every area is Hamilton, with a young population and a growing university, attracting busloads of young people and maintaining a relatively young age structure and a growing working-age population. Not every area can aspire to have those features.

A realistic approach to planning for population ageing and a declining working-age population involves treating cohort turnover as a sort of 'warning light', and recognising that migration may not be a realistic solution. The good news is that our 'migration won’t save us' result isn’t a dead end for local areas that have declining working-age populations. It's an opportunity to improve their planning. They should treat negative cohort turnover as an early warning sign, work on realistic migration scenarios, and stress-test the basics, such as workforce needs, housing, infrastructure, and local services. Migration is a bonus when it arrives, but resilience is what they need to design for.

Sunday, 22 February 2026

Distillers don't need tax relief in order to promote their goods internationally - they already have it

Earlier this week, the NBR reported (paywalled):

Kiwi distillers are calling on the Government to introduce an excise tax rebate scheme, arguing the current system is stifling an industry that could follow wine's path from obscurity to international recognition...

The proposal requests an excise duty remission of up to $350,000 annually for each distillery, which would free up funds that could be put towards employment, expansion, and export growth.

In order to be eligible for the DSA proposed scheme, distillers would need to hold a license to manufacture distilled beverages, produce at least 70% of its alcohol content (by volume) within New Zealand, be independent, and be a member of DSA.

The proposal is modelled on Australia's excise remission scheme, which allows domestic distilleries to claim up to A$400,000 ($469,600) a year...

On the outskirts of Auckland, Pōkeno Whisky's Johns estimates about 35% of his company's domestic revenue goes toward tax. He says he holds four roles at New Zealand's largest single malt distillery – running sales, marketing, operations, and general business – but doesn't pay himself. He has halved distillation over the past 18 months because times are tough, and is investing what he can into sales and marketing in an attempt to buck the trend.

"At the end of the day, we're not selling Pōkeno Whisky overseas. We're selling brand New Zealand."

Bluff Distillery's Nash says while a spirits tax made sense historically, the system was overweighted and out of date. He says a lot of distillers that could have explored international markets haven't been able to because the lion's share of returns go toward excise.

The first thing to note is that the excise tax paid by domestic distillers is not a big money-spinner for the government. The article reports that domestic distillers pay about $23 million in excise each year. That is small relative to the overall $800 million in total alcohol excise tax collected each year (see here). The purpose of an alcohol excise tax is to reduce the consumption of a good that has negative externalities - it is an example of a Pigovian tax. Reducing excise tax would lower the price that consumers pay for alcohol, increasing consumption, and increasing the negative externalities associated with alcohol consumption. That is not a proposal that should receive broad support.

Now, I was thinking about this and I had a better idea that would give some excise tax relief for distillers, without increasing alcohol-related harm in New Zealand: zero-rate the excise tax for exports. In other words, distillers would pay excise tax only on products that they sell domestically, and not on exports. If the argument by the distillers (as noted by Matt Johns of Pōkeno Whisky in the quote above) is that they want to explore international markets, then this proposal lets them do so, and on a more level playing field with distillers overseas. The distillers will pay tax on their profits. The government doesn't really need to tax them twice. And, since by definition exports are not sold domestically, there is no increase in negative externalities from removing the excise on those exports, and there may even be a decrease [*].

It turns out my proposal already happens - there is an 'excise duty drawback' that allows distillers to claim back the excise tax paid on any goods that they export. So, the distillers are already free to 'sell brand New Zealand' to their heart's content. They don't need to have their excise on New Zealand sales reduced in order to achieve that goal. Is there a real problem here? Or is this just another case of an industry with its hand out for government support?

*****

[*] Interestingly, the zero-rating of excise tax on exports may produce a further benefit in terms of reducing alcohol consumption (and negative externalities) in New Zealand. If it becomes more profitable to produce and export distilled products, then distillers may choose to sell less in New Zealand. That would actually increase prices in New Zealand, reducing alcohol sales and consumption.

To see how this works, consider a distiller who could sell overseas at a price P1, receiving the price P0 after paying an excise to the government on all of their production (sold overseas, or sold locally). Call the difference in those two prices T (the excise tax), so P1 - T = P0. It makes sense for the distiller to also sell its products at the price P1 in New Zealand (if they could receive a higher price overseas, they would sell there instead), also receiving P0 after paying the excise tax. Now, what happens when the excise tax is removed for exports? Instead of receiving P0 from exports, the distiller receives P1 (since they no longer have to pay the excise tax T). They won't want to sell their products in New Zealand and receive less than P1. They will only receive P1 from domestic sales if they raise the price from P1 to P1 + T (which leaves the distiller with P1 after they pay the excise tax T). So, we would expect the price on distilled products to increase in New Zealand, if the excise tax were removed from exports. In other words, the 'excise duty drawback' scheme likely increases prices on distilled products in New Zealand, although in reality the 'pass-through' of tax to retail prices is likely to be somewhat less than the full amount of T.
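The same argument can be sketched numerically. The world price and excise rate below are invented, purely for illustration:

```python
# Sketch of the export zero-rating argument, with invented numbers.
# Before the change: excise T applies to all production, so the distiller
# nets P1 - T on every unit, whether exported or sold domestically.
# After the change: exports net the full P1, so the domestic price must
# rise to P1 + T for domestic sales to net the same P1 (assuming full
# pass-through of the tax).

P1 = 100.0  # world price per unit (invented)
T = 30.0    # excise per unit (invented)

net_before = P1 - T    # what every unit nets before the change (= P0)
net_after_export = P1  # exports now escape the excise entirely

# Domestic price needed to match the export net return after the change:
domestic_price_after = net_after_export + T
print(net_before, domestic_price_after)  # 70.0 130.0: domestic price rises by T
```

With these numbers, the domestic price rises by the full excise T; as the footnote notes, real-world pass-through would likely be less than that.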

Friday, 20 February 2026

This week in research #114

Here's what caught my eye in research over the past week (a slow week, it seems!):

  • van der Sanden et al. find that 'store within a store' vape retailers in Auckland are much more prevalent in areas of high socioeconomic deprivation
  • Henrekson and Persson (open access) review the state of competition in European football, and argue that technological change (especially global broadcasting) and labour laws that have strengthened player mobility, have led to a small number of superstars capturing a disproportionate share of the surplus generated in the football market, and worsened competitive balance

Also new from the Waikato working papers series:

  • McNamara looks at pass-through of fuel taxes to retail fuel prices, using the introduction and repeal of the 10c-per-litre Auckland regional fuel tax as a natural experiment, finding that fuel prices increased by 10.8 cents per litre following the tax introduction and fell by 11.6 cents per litre after its repeal, indicating near-complete and symmetric pass-through on average, while local competition determined the extent of pass-through for particular retailers
  • My latest working paper, co-authored with Courtenay Baker, looks at the contributions of demographic change (cohort turnover and migration) to changes in the working-age population at the subnational level in New Zealand from 1998 to 2023, finding that cohort turnover (the number of people aged 10-14 entering the working-age population, minus the number aged 60-64 exiting it) is having a decreasing effect over time, meaning that local labour supply is increasingly contingent on highly variable migration flows (I'll talk more about this research in a post next week)

Thursday, 19 February 2026

What Thomas Malthus's death can teach us about how economic ideas shape government policy

Do economic ideas shape government policy? The answer seems obvious. After all, many people complain that economists have a strong influence on governments (for example, see here). But if there is a shift in prevailing economic ideas, does that flow through to government policy? It is an interesting question that is difficult to answer, so I was intrigued to read this job market paper by Eric Robertson (University of Virginia).

Robertson looks at the effect of the death of Thomas Malthus in 1834 on the decisions of British bureaucrats in colonial India. Specifically, he exploits:

...a unique historical experiment in a nineteenth-century British bureaucracy, focusing on an argument that Malthusian population theory and its associated ideas discouraged policymakers from intervening in response to agricultural distress and famine... Central to my approach is a bureaucrat training college, Haileybury, where civil servants studied prior to their careers in British India... Thomas Malthus taught economics at Haileybury for nearly three decades, from 1805 until his abrupt death in 1834, after which he was replaced by a contemporary critic, Richard Jones. I examine how the relative differences in exposure of bureaucrats to economic ideas under each instructor at Haileybury influenced their subsequent policy decisions, as well as their alignment with government directives...

Malthus and Jones had quite different views on the causes of poverty and famine, and their ideas suggested quite different policy responses. Malthus believed that it was diminishing productivity of agriculture as a result of overpopulation that caused poverty and famine, while Jones believed that capital investment and technological growth could offset diminishing agricultural productivity. So, in response to famine, bureaucrats trained by Jones would be more likely to respond with measures to supplement incomes than those trained by Malthus, with the latter believing that assistance was unnecessary in response to a natural mechanism that would ultimately lead to better living standards (because there would be fewer people after the famine).

Robertson constructs a dataset of bureaucrats ('district collectors') in colonial India and their policy decisions in response to droughts ('rainfall shortages'). Bureaucrats who wanted to actively respond to a drought had many means to do so, including:

...writing off taxes on agricultural land, opening public works to provide employment opportunities and raise wage income, distributing cash or food aid, importing food for subsidized sale, and providing loans or advances to the agricultural class...

Robertson looks separately at each of those policy responses. This 'natural experiment' is a useful way of establishing the causal impact of economic ideas on policy decisions, because the exact timing of Malthus's death was unrelated to the traits of the bureaucrats being trained. That means the bureaucrats trained just before Malthus's death, and those trained just after Malthus's death, are unlikely to be systematically different in ways related to policy decisions (other than through the way they were trained).

Robertson finds that:

...compared to their Jones-trained (Jonesian) counterparts, Malthus-trained (Malthusian) bureaucrats were less likely to provide relief across all of these common government interventions. I show that tax write-offs during drought were roughly thirty percent lower under Malthusian collectors than under Jonesians and I find evidence that expenditures on public works may have been up to twenty percent lower...

...a back-of-the-envelope calculation suggests that, if Malthusian collectors had implemented policies comparable to Jonesians, the increased aid would have translated into enough calories to support two million more person-days of subsistence during each episode of drought.

Robertson concludes that:

This research offers evidence that the exposure of bureaucrats to different types of economic ideas alters the types of policies they choose to implement.

So, the good news is that economic ideas do shape government policy (or, at the least, they did in the 19th Century). The bad news is also that economic ideas do shape government policy. Because not all economists agree. In this example, Malthus and Jones disagreed on the appropriate policy response to droughts. The Indian people were far better off (in terms of poverty relief, and probably welfare) during a drought when the bureaucrats were trained by Jones than when they were trained by Malthus. Millions of Indians died in famines in the 19th Century (see here). [*] Getting policy right has high stakes.

This research shows that what economics students are taught at university really does matter. That is both gratifying (to know that students will take on board what they learn and use it later) and a little scary (because what if they take on board the 'wrong' lessons, from the 'wrong' economics?). We could easily take a negative impression away from this, but there is a more hopeful lesson here. Economics classes, books, and arguments can make the world better, and sometimes dramatically so. That’s a pretty good reason to keep doing more of what we are doing, but also ensuring that students are equipped to take a critical view of what economics can really deliver through policy changes.

[HT: Marginal Revolution, both last year and this year]

*****

[*] Robertson reports no statistically significant impact of Malthusian training on famine or mortality, but those analyses lack the same extensive data available for the other analyses. It seems credible that the absence of assistance under Malthusian bureaucrats would have led to at least some negative impacts in the form of greater famine and/or mortality.

Wednesday, 18 February 2026

People's offsetting behaviour thwarts well-intentioned interventions in social media and smartphone use

People lead complicated lives. They have many competing goals, and have to trade off between those goals. Economists assume that they choose their actions with the overall aim of maximising their utility (satisfaction, or happiness). However, those many competing goals can sometimes thwart well-intentioned interventions. For example, when seatbelts were made compulsory, driving became safer at any given speed, and people responded by driving faster, and therefore less safely (for related examples, see here and here). Economists refer to that as offsetting behaviour.

Two recent examples of this arose in research papers I read this week. The first is this NBER Working Paper by Hunt Allcott (Stanford University) and a long list of co-authors, who investigated the impact of people temporarily deactivating Facebook or Instagram on their emotional state. Working with Meta (where some of the co-authors work), they:

...recruited 19,857 Facebook users and 15,585 Instagram users who spent at least 15 minutes per day on the respective platform. We randomly assigned 27 percent of participants to a treatment group that was offered payment for deactivating their accounts for the six weeks before the election. The remaining participants formed a control group that was paid to deactivate for just the first of those six weeks.

They then compare the difference in emotional state between before and after the deactivation for the treatment group (who deactivated for six weeks) and the control group (who deactivated for one week), and find that:

...users in the Facebook deactivation group reported a 0.060 standard deviation improvement in an index of happiness, anxiety, and depression, relative to control users...

...users in the Instagram deactivation group reported a 0.041 standard deviation improvement in the emotional state index relative to control.

Those effects are quite small in comparison to other interventions, and in comparison to changes in emotional state over time, and:

Under the approximation that emotional state index is normally distributed, the estimated effects of Facebook or Instagram deactivation would move the median user from the 50th percentile to the 52.4th or 51.6th percentile, respectively.
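That percentile translation is just the standard normal CDF evaluated at the effect size. A quick sanity check of the quoted figures (assuming, as the paper does, that the emotional state index is standard normal):

```python
from math import erf, sqrt

def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

# Facebook (0.060 SD) and Instagram (0.041 SD) effect sizes
for effect in (0.060, 0.041):
    print(round(100 * normal_cdf(effect), 1))  # 52.4, then 51.6
```

Both values match the 52.4th and 51.6th percentiles quoted in the paper, which is a useful reminder of just how small these effects are.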

Why was the effect so small? Users who deactivated Facebook or Instagram spent more of their newly freed-up time on other apps. Those who deactivated Facebook increased their use of Instagram, but also:

Facebook and Instagram deactivation both increased use of Twitter, Snapchat, TikTok, YouTube, web browsers, other social media apps, and other non-categorized apps by a few minutes per day.

It's little wonder that deactivating Facebook or Instagram had such small effects, given the offsetting behaviour of the users pivoting to using other apps, including other social media apps, instead. None of this is to say that the intervention made the users worse off, but it probably didn't make them better off overall either.

The second example is this NBER Working Paper by Billur Aksoy (Rensselaer Polytechnic Institute), Lester Lusher (University of Pittsburgh), and Scott Carrell (University of Texas at Austin), which looked at the effects of the app 'Pocket Points' at Texas A&M University. Specifically:

Pocket Points is marketed as a soft commitment device and provides incentives for students to stay off of their phones. In particular, Pocket Points rewards students with “points” for staying off their phones during class: Students open the app, lock their phone, and start accumulating points, all while the app verifies through GPS coordinates that the student is indeed in class. These points can then be used to get discounts at participating local and online businesses.

One thousand Texas A&M students were invited to participate in the experiment in 2017, and half were randomised to treatment, where they were instructed to download the Pocket Points App and create an account. Aksoy et al. then compare the treatment and control students. They also distinguish effects between those who used the app at least once, and those who used the app more than once a week (based on survey results). Importantly, Aksoy et al. first report that:

...treatment students were about 25 percentage points more likely to download the app... and over 31 percentage points more likely to use the app... than control students. Additionally, treatment students were 13 percentage points more likely to use the app more than once a week...

So, the treatment succeeded in encouraging students to use Pocket Points. But did using the app improve outcomes? Aksoy et al. find some positive effects in the classroom, such as:

...Pocket Points usage is associated with a 0.42 standard deviation reduction in phone distraction rate in the classroom... we observe increases in student satisfaction with their academic performance for the semester: Students who used the app more than once a week experienced more than a one standard deviation increase in satisfaction...

That seems promising. However, when they look at student grades (from their official TAMU transcripts), Aksoy et al. find that:

...students who used the app more than once a week experienced a 0.50 unit increase in GPA. These estimates, however, are statistically insignificant...

So, even though the Pocket Points app reduced in-class distractions, it had no statistically significant effect on students' grades. That may be because there were also:

...significant decreases in time spent studying on campus... treated students spent approximately 18.2 hours/week studying, 12.0 of which were on campus, whereas control students spent 20.3 hours/week studying, 14.1 of which were on campus. Thus, it appears that the increased learning and attendance in the classroom came with a reduction in time spent studying.

It's little wonder that there was no effect on students' grades, given the offsetting behaviour of students spending less time studying, perhaps because they believed (perhaps rightly) that their in-class study time was more effective without phone distractions. None of this is to say that the app made the students worse off, but it probably didn't make them better off overall either.

When we implement an intervention that we hope will lead to better outcomes, such as improved emotional state due to less time spent on social media, or improved student performance due to more focused studying in class, we need to be prepared for the offsetting behaviour of the people affected by the intervention. Their lives are complicated, and they are trading off between competing goals. Just because we want to make one of their goals easier to achieve, that doesn't mean that they will focus extra energy on that goal. As we have seen from the two examples above, they may simply re-focus their energies elsewhere, leaving the outcome that we want to improve unchanged.

[HT: Marginal Revolution, last year]

Tuesday, 17 February 2026

Can fertility return to replacement levels?

Many countries, including almost all developed countries and many developing countries, are now experiencing below-replacement fertility, with fertility rates having declined substantially over the past decade or more. That means that each generation will be progressively smaller than the last, and almost inevitably that leads to a declining population (in the absence of offsetting migration flows). Can countries reverse the trend of declining fertility, and return to replacement levels? Two new articles suggest that might be difficult.

The first is this article by Michael Geruso and Dean Spears (both University of Texas at Austin), published in the Journal of Economic Perspectives (open access). They look explicitly at the question of whether persistently low fertility can be reversed, but first they do a great job of setting the scene:

Fertility is low or falling across the world: among high-, middle-, and low-income countries; among secular and religious populations; and in economies where the state is large and where it is small. Birth rates have been falling not only for decades, but for centuries. They have been falling for as long as there are good historical records to document them...

The TFR [total fertility rate] has fallen from a global average that was a little under five in 1950 to a global average that is a little over two in 2025...

The 115 richest countries in the world together have an average total fertility rate of 1.5... A birth rate of 1.5 would lead to a decline of 44 percent in generation size over two generations...
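The arithmetic behind that 44 percent figure is simple compounding. Assuming (roughly) that two births per woman are needed for each generation to replace itself, ignoring mortality, a TFR of 1.5 means each generation is 75 percent the size of the one before:

```python
# Back-of-the-envelope check of the '44 percent over two generations' figure
# (assumes replacement requires roughly 2 births per woman, ignoring mortality)
tfr = 1.5
replacement = 2.0
generation_ratio = tfr / replacement        # each generation is 0.75x the last
decline_after_two = 1 - generation_ratio ** 2
print(round(100 * decline_after_two, 1))    # 43.8
```

That 43.8 percent matches the "decline of 44 percent" that Geruso and Spears report (their exact figure will reflect the precise replacement rate they use).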

Geruso and Spears then look at the trends in some detail, focusing attention on completed cohort fertility (CCF), which captures the average number of lifetime births for women born in a particular place and year. That is a better measure than the TFR, because it is not affected by the timing of births - a woman having her two children at ages 25 and 34, instead of at ages 25 and 27, would not change the CCF, but would affect the TFR in the years in which she gave birth (raising the TFR in the year she was aged 34, and lowering it in the year she would otherwise have given birth aged 27). In any case though, the trends in the two measures (CCF and TFR) are broadly similar, with both showing declining fertility over time across all of the countries that Geruso and Spears consider (with the exception of the US in the 1980s to 2000s, where there was a modest increase in fertility).
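To see why birth timing distorts the TFR but not the CCF, here is a stylised toy (all numbers hypothetical): every woman has exactly two children, so cohort fertility is always 2, but cohorts born from 'year 0' onward delay their second birth from age 27 to age 34. During the transition years, the period TFR dips well below the unchanged cohort fertility:

```python
# Stylised tempo-effect toy: CCF is fixed at 2 for every cohort, but cohorts
# born from year 0 onward delay their second birth from age 27 to age 34.
def second_birth_age(cohort_year):
    return 27 if cohort_year < 0 else 34

def period_tfr(year):
    # Period TFR = sum of age-specific fertility rates in one calendar year
    total = 0
    for age in range(15, 50):
        cohort_year = year - age
        if age == 25:                          # everyone has a first birth at 25
            total += 1
        if age == second_birth_age(cohort_year):
            total += 1
    return total

print(period_tfr(-10), period_tfr(30), period_tfr(60))  # 2 1 2
```

The dip in year 30 is purely a tempo effect: no woman in the toy ever has fewer than two children, yet the period TFR temporarily halves while the delayed second births are 'in transit'.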

Geruso and Spears then use global data from the Human Fertility Database (HFD), and Indian data from the National Family Health Survey, and explore the contribution of childlessness to the overall decline in fertility. In both datasets, they find that the majority of the decline in fertility is due to a decline in the number of children among women who have at least one child, rather than an increase in childlessness. For countries in the HFD, childlessness accounts for 37 percent of the decline in fertility between the cohort of mothers born in 1956 and the cohort born in 1976, while in India, childlessness accounts for just 9 percent of the differences in fertility across districts.

Finally, Geruso and Spears turn to the prospects for a reversal of the fertility trend. On this, they start by noting that in the HFD:

...there have been 24 countries in which cohort fertility ever fell below 1.9. In none of these cases have subsequent cohorts from the same country ever had fertility as high as 2.1...

And aside from the post-WWII Baby Boom, there are no significant episodes of increasing fertility. The Baby Boom itself was the result of an unusual set of circumstances that are (hopefully) unlikely to be repeated. Geruso and Spears then look at the microeconomic and programme evaluation literature, and note that:

...the clear-cut bottom line is that whatever impacts pro-natal policies and broader changes might have caused, none has caused low birth rates to reverse enduringly back to replacement levels.

Even a particularly strict programme in Romania that "banned abortion and made modern contraception effectively inaccessible" had only a short-term effect on the total fertility rate, and no effect on completed cohort fertility. So, this paper gives no reason to believe that declining fertility can be reversed. Geruso and Spears conclude that:

To put it bluntly, history offers no examples of societies recognizing very low birth rates as a social priority and then responding with effective changes that restore, and sustain, replacement-level fertility.

The second article is this one by Kimberly Babiarz (Stanford University), Paul Ma (University of Minnesota), Grant Miller (Stanford University), and Shige Song (City University of New York), forthcoming in the journal Review of Economics and Statistics (ungated earlier version here). They don't look explicitly at fertility decline, but they do look in detail at fertility in China, and in particular at the impact of the Wan Xi Shao (Later, Longer, Fewer) campaign, which predated the One Child Policy. That policy:

...aimed to limit fertility by promoting older age at marriage (“Later”), longer intervals between births (“Longer”), and fewer births per couple (“Fewer”).

The campaign was very successful, with the total fertility rate falling from 6 to about 2.75 over the course of the 1970s. The One Child Policy began in 1980, so by the time it was instituted, China's fertility had already fallen almost to replacement level. Babiarz et al. aren't the first to note this, but they extend the analysis further, exploiting differences in the timing of implementation of the policy across Chinese provinces to investigate how much of the decline in fertility was due to the policy, how it affected fertility decisions within Chinese families, and how many 'missing girls' are attributable to the policy implementation in combination with a societal preference for sons.

Now, as a policy the LLF aimed to:

...reduce crude annual birth rates in rural areas to 15 per 1,000 population via three primary mechanisms: (1) later marriage—delaying marriage to ages 23 and 25 (for rural women and men respectively); (2) longer birth intervals—increasing birth intervals to a minimum of four years; and (3) fewer lifetime births—limiting couples to 2–3 children in total...

The policy was implemented differently in urban areas, and about 87 percent of births in the sample occurred in rural areas, so Babiarz et al. focus attention on births in rural areas. Their main data source is the 1988 Two-per-Thousand National Survey of Fertility, which was a nationally representative survey that included around 400,000 women living in rural areas. I'm not going to go into detail on their methods (you should read the paper), but using an event study design, they find that the policy:

...reduced China’s total fertility rate by almost one birth per woman, accounting for about 30.6% of China’s overall fertility decline prior to 1980, or approximately 18.2 million averted births... Decomposing this TFR change into “quantum” and “tempo” effects, we show that, although the policy raised mothers’ median age at first birth by 5.2 months, the decline in TFR was largely the result of fewer lifetime births rather than changes in the timing of births.

They also find that:

...the LLF policy led directly to an increase in the use of both male-biased fertility-stopping rules and postnatal selection (via neglect or possible infanticide). Although postnatal selection was relatively rare, our results imply that the LLF policy resulted in about 180,000 additional missing girls, or approximately 19% of all missing girls during the 1970s.

So, the policy was quite successful in reducing Chinese fertility faster than it otherwise would have fallen. However, this came with the unintended consequence of fewer female births relative to male births, and the phenomenon of 180,000 'missing girls' (who would have been born, or survived, if the policy had not been in place).

How does this relate to the Geruso and Spears article, and what does it tell us about changing fertility? The Babiarz et al. article shows what it takes to move fertility quickly, but only in one direction (downwards). The LLF policy was dramatic, and successful, but it took a concerted government effort, supported by severe penalties, to achieve its aim. And this was in an environment where fertility was already declining. That's a very different challenge from trying to engineer a sustained increase in fertility back to replacement levels.

So, where does that leave us? If we have essentially no historical examples of societies successfully and sustainably reversing very low fertility, then the practical policy question shifts to planning for a future with progressively smaller age cohorts and older populations. That may mean reconsidering institutions that rely on a foundation of population growth (retirement and superannuation, and health and long-term care), as well as family-friendly policies and immigration settings. Policy proposals that treat women as a demographic instrument (like this one) aren’t a solution - they’re a warning sign that we’re asking policy to do something it may not be able to do.

[HT: Marginal Revolution, for the Babiarz et al. article]

Monday, 16 February 2026

Book review: Economists in the Cold War

In 2024, I reviewed Alan Bollard's book Economists at War, noting that it sat awkwardly in-between being a biography and an economic history. I just finished reading Bollard's 2023 book Economists in the Cold War, which follows a similar approach.

This book is basically a sequel to the earlier book, and adopts a similar format, focusing on seven economists: Harry Dexter White, Oskar Lange, John von Neumann, Ludwig Erhard, Joan Robinson, Saburo Okita, and Raul Prebisch. Each chapter is devoted to the life and works (and times) of one of these eminent economists. This book differs from the earlier volume by setting each of the seven economists against one of their contemporaries, respectively: John Maynard Keynes, Friedrich Hayek, Leonid Kantorovich, Jean Monnet, Paul Samuelson, Zhou En-lai, and Walt Rostow.

There is a bit of overlap with the earlier book, which features Keynes, Kantorovich, and von Neumann. However, there is plenty of new material in this book, and I especially appreciated the chapters on Lange, Erhard, Okita, and Prebisch, who I knew little about. I also really enjoyed the chapter on Joan Robinson, which helped me to solve the mystery (to me, at least) of why she never won the Nobel Prize in Economics. On that point, Bollard writes that:

Once more, Robinson had no compunction about forming strong public views from limited evidence on contentious issues... It has been suggested that the polemical content of these writings may have cost Joan Robinson the Nobel Prize in economics which her mainstream contributions might otherwise have earned... She never saw the need to separate her economic findings and her political opinions.

Bollard has a good way of bringing in anecdotes, even though he is adamant that he is not writing a biography of each economist. On Oskar Lange, Bollard tells us that:

...he was once invited to lunch by Al Capone the famous gangster, who he found to be self-educated and well-read with a good knowledge of politics and economics. They had a most interesting conversation, and at the end Capone offered: 'Professor, if you ever have a problem, anything at all, please do not hesitate to call me!'...

It is not just any economist who can call on such support! On the negative side, there is a fair amount of repetition, both between this book and the earlier volume, and even within the book itself. For instance, Bollard twice tells us that British government economist Alec Cairncross's brother John was a spy for the Soviets, within the span of 17 pages. This, and the several other similar instances, is a minor point in an otherwise excellent book, but was quite distracting for me.

Overall, I rate this book at least as highly as the earlier volume, although as I noted at the beginning it suffers from a similar flaw. In trying to avoid being biography and economic history, it ends up awkwardly caught in-between. Perhaps my views have softened somewhat on this in the last couple of years, or perhaps it was that this book covered a lot of new ground for me, but I thought that overall this was the better of the two books. Like Bollard's earlier book, I recommend this one for anyone interested in the key players and in the development of 20th Century economics.

Sunday, 15 February 2026

Déjà vu: It's not a tax, it's a levy

In 2018, I mocked the government for their insistence that an increase in fuel tax was an excise, not a tax. Since I'm a firm believer in equal treatment of the government of the day when they display their economic illiteracy, I thought I needed to pick up on this story from earlier in the week:

Is it a tax? Is it a levy? An additional charge for a liquefied natural gas import terminal has turned into a communications nightmare for the Government...

Asked if this was a new tax on households, the prime minister was quick to intervene.

“This isn’t a tax, it’s a levy to fund a key piece of infrastructure,” he said.

So, it's a levy, and that is different from a tax? Not according to the OED, which defines a levy as:

Levy, n.

A duty, impost, tax.

Or, if you prefer the Merriam Webster Dictionary:

1 a : the imposition or collection of an assessment

Merriam Webster then defines an assessment as (emphasis is mine):

2 : the amount assessed : an amount that a person is officially required to pay especially as a tax

A levy is a tax. It has the same effects as a tax (for example, see this post for the details) - it raises the price that consumers pay, it lowers the effective price that sellers receive (after paying the levy to the government), it delivers revenue to the government, and it creates a deadweight loss (even if there may be offsetting benefits from how the revenue is spent). Whether the government uses that revenue for a liquefied natural gas import terminal, or for any other purpose, that doesn't change the fact that the levy is a tax.
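For readers who want to see those effects concretely, here is a toy linear market (all numbers hypothetical, and nothing to do with the LNG levy itself): demand is P = 10 - Q, supply is P = Q, and a per-unit levy t drives a wedge between what buyers pay and what sellers keep:

```python
# Toy linear market: demand P = 10 - Q, supply P = Q, per-unit levy t
def market_with_levy(t):
    q = (10 - t) / 2              # quantity where 10 - Q = Q + t
    p_consumer = 10 - q           # price buyers pay (rises with t)
    p_seller = p_consumer - t     # price sellers keep (falls with t)
    revenue = t * q               # government revenue
    dwl = 0.5 * t * (5 - q)       # deadweight loss triangle (no-levy Q is 5)
    return q, p_consumer, p_seller, revenue, dwl

print(market_with_levy(2))  # (4.0, 6.0, 4.0, 8.0, 1.0)
```

With a levy of 2, buyers pay 6 instead of 5, sellers keep 4 instead of 5, the government collects 8, and a deadweight loss of 1 appears. Exactly the same arithmetic applies whether the wedge is called a tax, an excise, or a levy.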

I wrote back in that 2018 post that:

...this isn't the first (and it won't be the last) government to try their hardest not to refer to taxes as taxes.

It seems I was correct in that assessment.

Friday, 13 February 2026

This week in research #113

Here's what caught my eye in research over the past week:

  • Babiarz et al. (with ungated earlier version here) show that most of China’s fertility decline occurred during the earlier Wan Xi Shao (Later, Longer, Fewer, LLF) campaign, rather than the One Child Policy
  • Clark and Nielsen (with ungated earlier version here) conduct a meta-analysis of studies on the returns to education and find that, after controlling for publication bias, the effects are smaller than expected (perhaps 0-3 percent per year of education, compared with 8.2 percent per year without correction for publication bias)
  • Bratti, Granato, and Havari (open access) demonstrate that a policy reducing the number of exam retakes per year at one Italian university significantly improved students’ first-year outcomes, resulting in lower dropout rates, increased exam pass rates, and enhanced credit accumulation (presumably because the students had to give the exam their best shot the first time around)
  • Buechele et al. (open access) find no systematic evidence indicating that the prestige of the doctoral degree-granting university systematically affects individuals' odds of being appointed to professorships in Germany (because the prestigious universities train a disproportionate number of the PhD graduates)

Finally, I spent today and yesterday at the New Zealand Economics Forum. I wasn't one of the speakers, so I could again enjoy the proceedings from the floor. Overall, I thought that this may have been the best Forum so far (it has been running for six years now). You can watch recordings of the sessions now (here for Day One from the main room, here for the Day One breakout room sessions, here for Day Two from the main room, and here for the Day Two breakout room sessions). Enjoy!

Wednesday, 11 February 2026

Did employers value an AI-related qualification in 2021?

Many universities are rapidly adapting to education in the age of generative AI by trying to develop AI skills in their students. There is an assumption that employers want graduates with AI skills across all disciplines, but is there evidence to support that? This recent discussion paper by Teo Firpo (Humboldt-Universität zu Berlin), Lukas Niemann (Tanso Technologies), and Anastasia Danilov (Humboldt-Universität zu Berlin) provides an early answer. I say it's an early answer because their data come from 2021, before the wave of generative AI innovation that became ubiquitous following the release of ChatGPT at the end of 2022. The research also focuses on AI-related qualifications, rather than the more general AI skills, but it's a start.

Firpo et al. conduct a correspondence experiment, where they:

...sent 1,185 applications to open vacancies identified on major UK online job platforms... including Indeed.co.uk, Monster.co.uk, and Reed.co.uk. We restrict applications to entry-level positions requiring at most one year of professional experience, and exclude postings that demand rare or highly specialized skills...

Each identified job posting is randomly assigned to one of two experimental conditions: a "treatment group", which receives a résumé that includes additional AI-related qualifications and a "control group", which receives an otherwise identical résumé without mentioning such qualifications.

Correspondence experiments are relatively common in the labour economics literature (see here, for example), and involve the researcher making job applications with CVs (and sometimes cover letters) that differ in known characteristics. In this case, the applications differed by whether the CV included an AI-related qualification or not. Firpo et al. then focus on differences in callback rates, and they differentiate between 'strict callbacks' (invitations to interview), and 'broad callbacks' (any positive employer response, including requests for further information). Comparing callback rates between CVs with and without AI-related qualifications, they find:

...no statistically significant difference between treatment and control groups for either outcome measure...

However, when they disaggregate their results by job function, they find that:

In both Marketing and Engineering, résumés listing AI-related qualifications receive higher callback rates compared to those in the control group. In Marketing, strict callback rates are 16.00% for AI résumés compared to 7.00% for the control group (p-value = 0.075...), while broad callback rates are 24.00% versus 12.00% (p-value = 0.043...). In Engineering, strict callback rates are 10.00% for AI résumés compared to 4.00% for the control group (p-value = 0.163...), while broad callback rates are 20.00% versus 8.00% (p-value = 0.024...).

For the other job functions (Finance, HR, IT, and Logistics) there was no statistically significant effect of AI qualifications on either measure of callback rates. Firpo et al. then estimate a regression model and show that:

...including AI-related qualifications increases the probability of receiving an interview invitation for marketing roles by approximately 9 percentage points and a broader callback by 12 percentage points. Similarly, the interaction between the treatment dummy and the Engineering job function dummy in the LPM models is positive and statistically significant, but only for broad callbacks. AI-related qualifications increase the probability of a broad callback by at least 11 percentage points...

The results from the econometric model are only weakly statistically significant, but they are fairly large in size. However, I wouldn't over-interpret them because of the multiple-comparison problem (when many comparisons are tested at the five percent level, around five percent of them will show up as statistically significant just by chance). At best, the evidence that employers valued AI-related qualifications in 2021 is pretty limited, based on this research.
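To see why the multiple-comparison problem bites, assume (as a simplification) that the tests are independent and each is run at the five percent level. The chance of at least one spurious 'significant' result grows quickly with the number of comparisons:

```python
# Probability of at least one false positive across k independent tests,
# each at the 5 percent significance level (illustrative simplification)
alpha = 0.05
for k in (1, 6, 12):
    print(k, round(1 - (1 - alpha) ** k, 3))
```

With six independent comparisons (say, strict and broad callbacks across a few job functions), the chance of at least one spurious result is already about 26 percent; with twelve, it is about 46 percent.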

Firpo et al. were worried that employers might not have noticed the AI qualifications in the CVs, so they conducted an online survey of over 700 professionals with hiring experience and domain knowledge. That survey showed that the AI-related qualification was salient, but that it signalled greater technical skills and lower social skills. These conflicting signals are interesting, and suggest that employers are looking for both technical skills and social skills in entry-level applicants. Does this, alongside the earlier results for different job functions, imply that technical skills are weighted more heavily than social skills for Engineering and Marketing jobs? I could believe that for Engineering, but for Marketing I have my doubts, because interpersonal skills are likely to be important in Marketing. Again though, it's probably best not to over-interpret the results.

Firpo et al. conclude that:

...our findings challenge the assumption that AI-related qualifications unambiguously enhance employability in early-career recruitment. While such skills might be valued in abstract or strategic terms, they do not automatically translate into interview opportunities, at least not in the entry-level labor market in job functions such as HR, Finance, Marketing, Engineering, IT and Logistics.

Of course, these results need to be considered in the context of their time. In 2021, AI-related skills might not have been much in demand by employers. That is unlikely to hold true now, given that generative AI use has become so widespread. It would be interesting to see what a more up-to-date correspondence experiment would find.

[HT: Marginal Revolution]

Read more:

  • ChatGPT and the labour market
  • More on ChatGPT and the labour market
  • The impact of generative AI on contact centre work
  • Some good news for human accountants in the face of generative AI
  • Good news, bad news, and students' views about the impact of ChatGPT on their labour market outcomes
  • Swiss workers are worried about the risk of automation
  • How people use ChatGPT, for work and not
  • Generative AI and entry-level employment
  • Survey evidence on the labour market impacts of generative AI
Tuesday, 10 February 2026

Who on earth has been using generative AI?

Who are the world's generative AI users? That is the question addressed in this recent article by Yan Liu and He Wang (both World Bank), published in the journal World Development (ungated earlier version here). They use website traffic data from Semrush, alongside Google Trends data, to document worldwide generative AI use up to March 2024 (so, it's a bit dated now, as this is a fast-moving area, but it does provide an interesting snapshot up to that point). In particular, Liu and Wang focus on geographical heterogeneity in generative AI use (measured as visits to generative AI websites, predominantly, or in some of their analyses, entirely ChatGPT), and they explore how that relates to country-level differences in institutions, infrastructure, and other variables.

Some of the results are fairly banal, such as the rapid increase in website traffic to AI chatbot websites, a corresponding decline in traffic to sites such as Google and Stack Overflow, and the fact that users skew younger, more educated, and male. Those demographic differences will likely become less dramatic over time as user numbers increase. However, the geographic differences are important and could be more persistent. Liu and Wang show that:

As of March 2024, the top five economies for ChatGPT traffic are the US, India, Brazil, the Philippines, and Indonesia. The US share of ChatGPT traffic dropped from 70 % to 25 % within one month of ChatGPT’s debut. Middle-income economies now contribute over 50 % of traffic, showing disproportionately high adoption of generative AI relative to their GDP, electricity consumption, and search engine traffic. Low-income economies, however, represent less than 1 % of global ChatGPT traffic.

So, as of 2024, most generative AI use was in middle-income countries, but remember that those are also high-population countries (like India). Measured per internet user (proxied by search engine traffic), however, generative AI use is disproportionately concentrated in high-income countries. Figure 12 in the paper illustrates this nicely, showing generative AI use, measured as visits per internet user:

Notice that the darker-coloured countries, where a higher proportion of internet users used ChatGPT, are predominantly in North America, western Europe, and Australia and New Zealand. On that measure, Liu and Wang rank New Zealand 20th (compared with Singapore first, and Australia eighth). There are a few interesting outliers like Suriname (sixth) and Panama (17th), but the vast majority of the top twenty countries are high-income countries.

What accounts for generative AI use at the country level? Using a cross-country panel regression model, Liu and Wang find that:

Higher income levels, a higher share of youth population, better digital infrastructure, and stronger human capital are key predictors of higher generative AI uptake. Services’ share of GDP and English fluency are strongly associated with higher chatbot usage.

Now, those results simply demonstrate correlation, and are not causal. And website traffic could be biased due to use of VPNs, etc., not to mention that it doesn't account very well for traffic from China or Russia (and Liu and Wang are very upfront about that limitation). Nevertheless, it does provide a bit more information about how countries with high generative AI use differ from those with low generative AI use. Generative AI has the potential to level the playing field somewhat for lower-productivity workers, and lower-income countries. However, that can only happen if lower-income countries access generative AI. And it appears as if, up to March 2024 at least, they are instead falling behind. As Liu and Wang conclude, any catch-up potential from generative AI:

    ...depends on further development as well as targeted policy interventions to improve digital infrastructure, language accessibility, and foundational skills.

    To be fair, that sounds like a general prescription for development policy in any case.


    Monday, 9 February 2026

    The promise of a personalised, AI-augmented textbook, and beyond

    In the 1980s, the educational psychologist Benjamin Bloom introduced the 'two-sigma problem' - the finding that students who were tutored one-on-one using a mastery approach performed on average two standard deviations (two sigma) better than students educated in a more 'traditional' classroom setting. That research is often taken as a benchmark for how good an educational intervention can be (relative to a traditional classroom baseline). The problem, of course, is that one-on-one tutoring is not scalable. It simply isn't feasible for every student to have their own personal tutor. Until now.

    Generative AI makes it possible for every student to have a personalised tutor, available 24/7 to assist with their learning. As I noted in yesterday's post, though, how that AI tutor is set up is crucial: it needs to ensure that students engage meaningfully, in a way that promotes their own learning, rather than simply becoming a tool for 'cognitively offloading' difficult learning tasks.

    One promising approach is to create customised generative AI tools that are specifically designed to act as tutors or coaches, rather than simple 'answer-bots'. This new working paper by the LearnLM team at Google (and a long list of co-authors) provides one example. They describe an 'AI-augmented textbook', the 'Learn Your Way' experience, which:

    ...provides the learner with a personalized and engaging learning experience, while also allowing them to choose from different modalities in order to enhance understanding.

    Basically, this initially involves taking some source material, which in their case is a textbook, but could just as easily be lecture slides, transcripts, and related materials from a class. It then personalises those materials to the interests of the student, adapting the examples and exercises to fit a context that the student finds more engaging. For example, if the student is an avid football fan, they might see examples drawn from football. And if the student is into Labubu toys, they might see examples based on that.

    The working paper describes the approach, reports a pedagogical evaluation by experts, and presents a randomised controlled trial (RCT) of the approach's impact on student learning. The experts rated the Learn Your Way experience across a range of criteria, and the results were highly positive. The only criterion where scores were notably low was visual illustrations. That accords with my experience so far with AI tutors, which are not good at drawing economics graphs in particular (an ongoing source of some frustration!).

    The RCT involved sixty high-school students in Chicago-area schools, who studied this chapter on adolescent brain development. Half of the students were assigned to Learn Your Way, and half to a standard digital PDF reader. As the LearnLM Team et al. explain:

    Participants then used the assigned tool to study the material. Learning time was set to a minimum of 20 minutes and a maximum of 40 minutes. After this time, each participant had 15 minutes to complete the Immediate Assessment via a Qualtrics link.

    They then did a further assessment three days later (a 'Retention Assessment'). In terms of the impact of Learn Your Way:

    The students who used Learn Your Way received higher scores than those who used the Digital Reader, in both the immediate (p = 0.03) and retention (p = 0.03) assessments.

    The difference in test outcomes was 77 percent vs. 68 percent in the Immediate Assessment, and 78 percent vs. 67 percent in the Retention Assessment. So, the AI-augmented textbook increased scores by about 10 percentage points, both immediately and in the short term (three days later). Of course, this was just a single study, with a relatively small sample of 60 students in a single setting, but it does offer some promise for the approach.
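    Just to make the arithmetic behind that comparison explicit (the group means are the ones reported in the paper; everything else below is my own back-of-the-envelope check):

    ```python
    # Group mean scores (percent correct) reported in the paper
    scores = {
        "immediate": {"learn_your_way": 77, "digital_reader": 68},
        "retention": {"learn_your_way": 78, "digital_reader": 67},
    }

    # Percentage-point gain from Learn Your Way on each assessment
    gains = {
        assessment: g["learn_your_way"] - g["digital_reader"]
        for assessment, g in scores.items()
    }
    print(gains)  # 9 points immediately, 11 points after three days

    avg_gain = sum(gains.values()) / len(gains)
    print(f"average gain: {avg_gain:.0f} percentage points")
    ```

    Note that these are percentage-point differences in mean scores, not standardised effect sizes, and with only about 30 students per arm the confidence intervals around them will be wide.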

    I really like this idea of dynamically adjusting content to suit students' interests, which is a topic I have published on before. However, using generative AI in this way allows material to be customised for every student, creating a far more personalised approach to learning than any teacher could manage. I doubt that even one-on-one tutoring could match the level of customisation that generative AI can offer.

    This paper has gotten me thinking about the possibilities for personalised learning. Over the years, I have seen graduate students with specific interests left disappointed by what we are able to offer in terms of empirical papers. For example, I can recall students highly interested in economic history, the economics of education, and health economics in recent years. Generative AI offers the opportunity to provide a much more tailored education to students who have specific interests.

    This year, I'll be teaching a graduate paper for the first time in about a decade. My aim is to allow students to tailor that paper to their interests, by embarking on a series of conversations about research papers based on those interests. Where that leads will be almost entirely up to the student (although with some guidance from me, where needed). Students might adopt a narrow focus on a particular research method, a particular research question, or a particular field or sub-field of economics. Assisted by a custom generative AI tool, they can read and discuss papers, try out replication packages, and/or develop their own ideas. Their only limit will be how much time they want to put into it. Of course, some students will require more direction than others, but that is what our in-class discussion time will be for.

    I am excited by the prospects of this approach, and while it will be a radical change to how our graduate papers have been taught in the past, it might offer a window to the future. And best of all, I have received the blessing of my Head of School to go ahead with this as a pilot project that might be an exemplar for wider rollout across other papers. Anyway, I look forward to sharing more on that later (as I will turn it into a research project, of course!).

    The ultimate question is whether we can use generative AI in a way that moves us closer to Bloom’s two-sigma benefit of one-on-one tutoring. The trick will be designing it so that students still do the cognitive work. My hope (and, it seems, the LearnLM team’s) is that personalisation increases students' engagement with learning rather than replacing it. If it works, this approach could be both effective and scalable in a way that human one-on-one tutoring simply can’t match.

    [HT: Marginal Revolution, for the AI-augmented textbook paper]

    Sunday, 8 February 2026

    Neuroscientific insights into learning and pedagogy, especially in the age of generative AI

    In May last year, my university's Centre for Tertiary Teaching and Learning organised a seminar by Barbara Oakley of Oakland University, with the grand title 'The Science of Learning'. It was a fascinating seminar about the neuroscience of learning, and in my mind, it justified several of my teaching and learning practices, such as continuing to lecture, emphasising students' learning of basic knowledge in economics, and using retrieval practice and spaced repetition as learning tools.

    Now, I've finally read the associated working paper by Oakley and co-authors (apparently forthcoming as a book chapter), and I've been able to pull out further insights that I want to share here. The core of their argument is in the Introduction to the paper. First:

    Emerging research on learning and memory reveals that relying heavily on external aids can hinder deep understanding. Equally problematic, however, are the pedagogical approaches used in tandem with reliance on external aids—that is, constructivist, often coupled with student-centered approaches where the student is expected to discover the insights to be learned... The familiar platitude advises teachers to be a guide on the side rather than a sage on the stage, but this oversimplifies reality: explicit teaching—clear, structured explanations and thoughtfully guided practice—is often essential to make progress in difficult subjects. Sometimes the sage on the stage is invaluable.

    I have resisted the urge to move away from lectures as a pedagogical tool, although I'd like to think that my lectures are more than simply information dissemination. I actively incorporate opportunities for students to have their first attempts at integrating and applying the economic concepts and models they are learning - the first step in an explicit retrieval practice approach. Oakley et al. note the importance of both components, because:

    ...mastering culturally important academic subjects—such as reading, mathematics, or science (biologically secondary knowledge)—generally requires deliberate instruction... Our brains simply aren’t wired to effortlessly internalize this kind of secondary knowledge—in other words, formally taught academic skills and content—without deliberate practice and repeated retrieval.

    The paper goes into some detail about the neuroscience underlying this approach, but again it is summarised in the Introduction:

    At the heart of effective learning are our brain's dual memory systems: one for explicit facts and concepts we consciously recall (declarative memory), and another for skills and routines that become second nature (procedural memory). Building genuine expertise often involves moving knowledge from the declarative system to the procedural system—practicing a fact or skill until it embeds deeply in the subconscious circuits that support intuition and fluent thinking...

    Internalized networks form mental structures called schemata, (the plural of “schema”) which organize knowledge and facilitate complex thinking... Schemata gradually develop through active engagement and practice, with each recall strengthening these mental frameworks. Metaphors can enrich schemata by linking unfamiliar concepts to familiar experiences... However, excessive reliance on external memory aids can prevent this process. Constantly looking things up instead of internalizing them results in shallow schemata, limiting deep understanding and cross-domain thinking.

    This last point, about the shallowness of learning when students rely on 'looking things up' instead of relying on their own memory of key facts (and concepts and models, in the case of economics), leads explicitly to worries about learning in the context of generative AI. When students rely on external aids (known as 'cognitive offloading'), then learning becomes shallow, because:

    ...deep learning is a matter of training the brain as much as informing the brain. If we neglect that training by continually outsourcing, we risk shallow competence.

    Even worse, there is a feedback loop embedded in learning, which exacerbates the negative effects of cognitive offloading:

    Without internally stored knowledge, our brain's natural learning mechanisms remain largely unused. Every effective learning technique—whether retrieval practice, spaced repetition, or deliberate practice—works precisely because it engages this prediction-error system. When we outsource memory to devices rather than building internal knowledge, we're not just changing where information is stored; we're bypassing the very neural mechanisms that evolved to help us learn.

    In short, internalized knowledge creates the mental frameworks our brains need to spot mistakes quickly and learn from them effectively. These error signals do double-duty: they not only help us correct mistakes but also train our attention toward what's important in different contexts, helping build the schemata we need for quick thinking. Each prediction error, each moment of surprise, thus becomes an opportunity for cognitive growth—but only if our minds are equipped with clear expectations formed through practice and memorization...

    Learning works through making mistakes, recognising those mistakes, and adapting to reduce those mistakes in future. Ironically, this is analogous to how generative AI models are trained (through 'reinforcement learning'). When students offload learning tasks to generative AI, they don't get an opportunity to develop the underlying internalised knowledge that allows them to recognise mistakes and learn from them. Thus, it is important for significant components of student learning to happen without resorting to generative AI (or other tools that allow students to cognitively offload tasks).
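    The prediction-error mechanism can be captured in computational miniature by a delta-rule learner, a standard toy model from the learning literature (not anything from Oakley et al.'s paper):

    ```python
    def delta_rule(outcomes, alpha=0.2, estimate=0.0):
        """Update an estimate toward each observed outcome in
        proportion to the prediction error (the 'surprise')."""
        history = []
        for outcome in outcomes:
            error = outcome - estimate   # prediction error
            estimate += alpha * error    # learn only from the error signal
            history.append(estimate)
        return history

    # Five retrievals of the same fact: the error shrinks each time,
    # but only because the learner generated a prediction to be wrong about.
    print([round(e, 3) for e in delta_rule([1.0] * 5)])
    # → [0.2, 0.36, 0.488, 0.59, 0.672]
    ```

    If the answer is simply supplied (offloaded), there is no error term to learn from, which is the authors' point in miniature.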

    Now, in order to encourage learning, teachers must provide students with the opportunity to make, and learn from, mistakes. Oakley et al. note that:

    ...cognitive scientists refer to challenges that feel difficult in the moment but facilitate deeper, lasting understanding as “desirable difficulties... Unlike deliberate practice, which systematically targets specific skills through structured feedback, desirable difficulties leverage cognitive struggle to deepen comprehension and enhance retention...

    Learning is not supposed to be easy. It is supposed to require effort. This is a point that I have made in many discussions with students. When they find a paper relatively easy, it is likely that they aren't learning much. And tools that make learning easier can hinder, rather than help, the learning process. In this context, generative AI becomes potentially problematic for learning for some (but not all) students. Oakley et al. note that:

    Individuals with well-developed internal schemas—often those educated before AI became ubiquitous—can use these tools effectively. Their solid knowledge base allows them to evaluate AI output critically, refine prompts, integrate suggestions meaningfully, and detect inaccuracies. For these users, AI acts as a cognitive amplifier, extending their capabilities.

    In contrast, learners still building foundational knowledge face a significant risk: mistaking AI fluency for their own. Without a robust internal framework for comparison, they may readily accept plausible-sounding output without realizing what’s missing or incorrect. This bypasses the mental effort—retrieval, error detection, integration—that neuroscience shows is essential for forming lasting memory engrams and flexible schemas. The result is a false sense of understanding: the learner feels accomplished, but the underlying cognitive work hasn’t been done.

    The group that benefits from AI as a complement to studying is not just those who were educated before AI became ubiquitous, but also those who learn in an environment where generative AI is explicitly positioned as a complement to learning (rather than a substitute for it). To a large extent, it depends on how generative AI is used as a learning tool. Oakley et al. do provide some good examples (and I have linked to some in past blog posts). I'd also like to think that the AI tutors I have created for my ECONS101 and ECONS102 students assist with, rather than hamper, learning (and I have some empirical evidence that seems to support this, which I have already promised to blog about in the future).

    Oakley et al. conclude that:

    Effective education should balance the use of external tools with opportunities for students to internalize key knowledge and develop rich, interconnected schemata. This balance ensures that technology enhances learning rather than creating dependence and cognitive weakness.

    Finally, they provide some evidence-based strategies for enhancing learning (bolding is mine):

    • Embrace desirable difficulty—within limits: Encourage learners to generate answers and grapple with problems before turning to help... In classroom practice, this means carefully calibrating when to provide guidance—not immediately offering solutions, but also not leaving students floundering with tasks far beyond their current capabilities...
    • Assign foundational knowledge for memorization and practice: Rather than viewing factual knowledge as rote trivia, recognize it as the glue for higher-level thinking...
    • Use procedural training to build intuition: Allocate class time for practicing skills without external aids. For instance, mental math exercises, handwriting notes, reciting important passages or proofs from memory, and so on. Such practices, once considered old-fashioned, actually cultivate the procedural fluency that frees the mind for deeper insight...
    • Intentionally integrate technology as a supplement, not a substitute: When using AI tutors or search tools, structure their use so that the student remains cognitively active...
    • Promote internal knowledge structures: Help students build robust mental frameworks by ensuring connections happen inside their brains, not just on paper... guide students to identify relationships between concepts through active questioning ("How does this principle relate to what we learned last week?") and guided reflection...
    • Educate about metacognition and the illusion of knowledge: Help students recognize that knowing where to find information is fundamentally different from truly knowing it. Information that exists "out there" doesn't automatically translate to knowledge we can access and apply when needed.

    I really like those strategies as a prescription for learning. However, I am understandably biased, because many of the things I currently do in my day-to-day teaching practice are encompassed within (or similar to) those suggested strategies. I'll work on making 'guided reflection' a little more interactive in my classes this year, as I have traditionally made the links explicit for the students, rather than inviting them to make those links for themselves. We have been getting our ECONS101 students to reflect more on learning, and we'll be revising that activity (which happens in the first tutorial) this year to embrace more of a focus on metacognition.

    Learning is something that happens (often) in the brain. It should be no surprise that neuroscience has some insights to share on learning, and what that means for pedagogical practice. Oakley et al. take aim at some of the big names in educational theory (including Bloom, Dewey, Piaget, and Vygotsky), so I expect that their work is not going to be accepted by everyone. However, I personally found a lot to vindicate my pedagogical approach, which has developed over two decades of observational and experimental practice. I also learned that there are neuroscientific foundations for many aspects of my approach. And, I learned that there are things I can do to potentially further improve student learning in my classes.