Sunday, 14 June 2026

Book review: How to Think Like an Economist (Roger Arnold)

If you ask many economics teachers, they will tell you that they really want to teach students how to think like an economist. However, in amongst the supply and demand curves, the elasticities, and the multiplier effects, the core goal of teaching students to actually think like an economist gets lost, overwhelmed by a lot of do this stuff like an economist. So, it's interesting when a book actually tries to get behind the models and teach the underlying thinking.

That's what the 2005 book How to Think Like an Economist, by Roger Arnold, tries to do. Arnold explains that:

To teach students how economists think, we must tell them stories. While we tell the stories, we must point out just what is "running through the economist's head." In this book, I have tried to focus on what goes through the economist's head as he or she looks at the world.

And mostly, Arnold is successful, although it isn't always the case that every economist would think in the same way. For example, Arnold makes a big deal about ratios. And while ratios are important, I for one am never thinking about the ratio of marginal benefit to marginal cost, when I can simply think about which one is larger. The ratio is redundant.

There is a lot to like about this book, and Arnold surfaces some of the more surprising (to non-economists) ways that economists would think about problems. For example, who but an economist would even ask the question, "What is the optimum amount of hitting yourself in the head with a hammer?" And yet, Arnold treats us to a consideration of exactly that question in the second chapter.

Having said that, I felt like the book was quite uneven. Although Arnold warns readers at the beginning that the book is intended as a companion to a more thorough textbook economics treatment, and gives examples of how the chapters can be mixed and matched with various styles of economics courses, a reader reading the book chapter by chapter is constantly confronted with terminology that is left unexplained until later chapters. This was most jarring in the case of the 'equilibrium price', which came with no explanation of what equilibrium is, nor why the equilibrium price is important at all. Similarly, Arnold first uses the term ceteris paribus without explaining what it means. And if you want to understand how the economist thinks, understanding the meaning of ceteris paribus (which, for the record, means holding all else constant) is kind of important.

Arnold also betrays a lack of understanding of some real-world context. Blackjack is provided as an example of a zero-sum game played between the players. However, blackjack in the real world is not at all like that. Blackjack players are playing against the house, not against each other. One blackjack player's win does not in itself entail a loss to the other players.

So, although understanding how economists think is important, and I applaud the effort and the approach that this book takes, I feel like it fell a bit short of the mark. This book is long out of print, but that might not be such a bad thing.

Friday, 12 June 2026

This week in research #130

Here's what caught my eye in research over the past week:

  • Fumarco and Groero (open access) describe a Stata package that reduces a dataset down to just those variables that are used in a particular .do file (useful for creating replication packages while minimising data bloat)
  • Cox (open access) describes three Stata commands that create a new dataset of the quantiles, percentiles, or confidence intervals for a particular variable or result (if you've ever needed to do this, you will know how frustrating it is)
  • Yarashov, Baryshnikova, and Kakhkharov find that military expansion exerts a significant negative impact on fertility across 15 post-Soviet countries between 1992 and 2022
  • Chatterjee, Dimova, and Ojha (open access) find, using a correspondence study in urban India, that equally qualified single mothers are much less likely to receive interview callbacks than unmarried women without children, married women, and married mothers
  • Charness et al. (with ungated earlier version here) provide a convincing argument of the virtues of lab experiments in economics
  • In a companion piece, Gneezy examines the principles of experimental economics
  • Wang finds that China's policy to limit young people's access to online video games did not produce detectable effects on academic performance, study time, or health
  • Pritchett and Viarengo (open access) demonstrate that ad hoc poverty lines, including the World Bank's poverty lines, are far too low to be plausible candidates for an inclusive global poverty line

Wednesday, 10 June 2026

Is it working from home, and not generative AI, that is harming the prospects of young workers?

There is growing evidence that the labour market for young workers is challenging. Graduates are finding it more difficult to get jobs after graduation. Several research papers have noted that generative AI may be to blame (see this post, for example), with one research paper referring to the changes in the labour market as seniority-biased technological change (see this post).

But the challenge with trying to attribute changes in the labour market to the rise of generative AI is that there are other contemporaneous changes affecting the labour market as well. One of those changes is the rise of working from home (as I noted in yesterday's post). Working from home may reduce the prospects for junior workers in part because it costs more to supervise and monitor them when they are working from home. Junior workers also benefit from on-the-job learning when they work with other people, and that on-the-job learning is less effective when they work from home. Combining those two effects, working from home reduces the incentive for employers to hire junior workers.

This new working paper by Peter Lambert (University of Warwick) and Yannick Schindler (Ellison Institute of Technology, Oxford) tries to disentangle the effects of generative AI and working from home on employment of younger workers. They use data from Revelio Labs that is made up of monthly matched employer-employee records collected from résumés (predominantly from LinkedIn) to construct a measure of the junior share of all new hires. They also use data from Lightcast on the near-universe of online job postings across thousands of online job sites and other websites. They use the Lightcast data to construct a measure of the share of job postings that require three or fewer years of experience. Their data from both sources covers the period from 2017 to 2025, and includes four countries: the US, the UK, Canada, and Australia.

Lambert and Schindler then use that data, along with measures of 'exposure to generative AI' and 'exposure to working from home' at the occupation level, in a difference-in-differences strategy. That means that they essentially compare the change in the share of junior job hires (or job postings) between occupations that are more or less exposed to generative AI (or working from home). Their main results are neatly summarised in Figure 3 from the paper:

Panel (a) shows that the junior share of new hires decreases significantly in jobs that are more exposed to working from home, from 2023 onwards (the black line). When they also control for exposure to generative AI (the red line), the effect of working from home barely changes. In contrast, Panel (b) shows that the junior share of new hires also decreases significantly in jobs that are more exposed to generative AI, from 2023 onwards (the black line). However, when they also control for exposure to working from home (the blue line), the effect of generative AI becomes much smaller and statistically insignificant. The results are similar for the share of job postings requiring three or fewer years' experience, as shown in Panels (c) and (d) of the figure.

The effects are quite large too. A one-standard-deviation increase in exposure to working from home reduces the junior share of new hires by about two percentage points, and the share of job postings requiring three or fewer years' experience by 1.5 percentage points.

Lambert and Schindler conclude that, based on their results, working from home is a better predictor of the decline in junior hiring than generative AI. Given potential benefits of working from home, they are reluctant to recommend policies against working from home, instead noting that:

...micro-level adjustments may be required to help firms adapt their organizational practices, so as to enjoy the benefits of WFH [work from home] arrangements while simultaneously managing the development of early-career talent.

Seen alongside the negative mental health impacts of working from home (as noted in yesterday's post), this should give us further pause for thought. However, it is worth noting that even if working from home is a better predictor of reductions in junior hiring than generative AI within their model, that doesn't let generative AI off the hook entirely. Since both trends are happening at the same time, reducing working from home might not eliminate the negative impacts on junior hiring, but instead make generative AI appear more important as an explanation. Lambert and Schindler note early in their paper that it is often the same occupations (white-collar occupations) that are most exposed to both working from home and generative AI. Given that, perhaps Lambert and Schindler's recommendation for micro-level changes in organisational practice may be the best mitigation strategy available to us.

[HT: Marginal Revolution]

Tuesday, 9 June 2026

Two new studies on who works from home, and its mental health impacts

The pandemic caused a massive rise in working from home and now, even though lockdowns are long since over and many workers have returned to the workplace, we are beginning to understand working from home (WFH) a lot better. Two new studies have recently added to our understanding.

The first is this article by Cevat Giray Aksoy (European Bank for Reconstruction and Development) and co-authors, published in the AEA Papers and Proceedings (ungated earlier version here). They use data from the monthly US Survey of Working Arrangements and Attitudes, limiting their data to the period from January 2024 to December 2025, and document three facts about WFH. First, employees are more likely to work from home if they work for a younger firm, and WFH peaks among those working for employers that were founded at the height of the pandemic, in 2020.

Second, employees are more likely to work from home if they work at a firm with a younger CEO. Specifically:

Firms led by CEOs under 30 have an average of 1.4 WFH days per week, compared with 1.1 days at firms led by CEOs who are 60 or older.

That doesn't seem like a lot, but an additional 0.3 days per week is a little more than three working weeks per year of WFH for those working for the youngest CEOs compared with those working for the oldest. However, this relationship between CEO age and WFH appears to be partly explained by the fact that younger CEOs are more likely to be leading younger firms. When Aksoy et al. put both CEO age and firm age in the same regression model, only firm age remains statistically significant. It is a similar story for CEO gender, which is initially statistically significant, but since female CEOs tend to be younger and to be CEOs of younger firms, CEO gender isn't statistically significant once those other variables are controlled for.

Third, the self-employed are much more likely to work from home. Specifically:

Self-employed workers report two to three times as many WFH days per week as wage and salary employees, depending on employer size. Compared to wage and salary employees, the self-employed are more than three times as likely to work in a fully remote capacity.

This last result is not entirely surprising, given that the self-employed typically have a lot more flexibility over scheduling. And, the self-employed may be the type of people who most value flexibility as well.

The second new article is this one by Natalia Emanuel (Federal Reserve Bank of New York), Emma Harrington (University of Virginia), and Amanda Pallais (Harvard University), published in the prestigious journal Science (open access). They look at the mental health impacts of WFH, using US data from a variety of sources, and a difference-in-differences approach. This involves comparing occupations that are more or less amenable to WFH, between the time before the pandemic and the time after the pandemic. They refer to the occupations that are more amenable to WFH as 'remotable'.
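The logic of that comparison can be sketched in a few lines of Python. This is only an illustration of the basic two-group, two-period calculation, not the authors' estimation: the group means below are hypothetical numbers, and the function name is made up for the example.

```python
# Hypothetical group means of some outcome, before and after the pandemic.
# These numbers are illustrative only and are NOT taken from Emanuel et al.
means = {
    ("remotable", "pre"): 3.0,
    ("remotable", "post"): 3.6,
    ("nonremotable", "pre"): 3.0,
    ("nonremotable", "post"): 3.3,
}


def diff_in_diff(means, treated, control):
    """Change in the treated group minus the change in the control group."""
    treated_change = means[(treated, "post")] - means[(treated, "pre")]
    control_change = means[(control, "post")] - means[(control, "pre")]
    return treated_change - control_change


estimate = diff_in_diff(means, "remotable", "nonremotable")
print(f"Difference-in-differences estimate: {estimate:.2f}")
```

The outcome rises in both groups, so a simple before-and-after comparison for remotable jobs alone would overstate the effect. Subtracting the change in the non-remotable group removes whatever affected everyone, leaving only the differential change.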

Emanuel et al. first document the dramatic rise of WFH:

The pandemic led to a large increase in remote work for those in remotable jobs, such that by 2024, workers in remotable jobs spent 31.1% of workdays fully remote, whereas people in nonremotable jobs spent only 8.9% fully remote... Those in remotable jobs experienced a 17.9 percentage point (pp) differential increase in fully remote work...

They then show that this rise is associated with more time spent alone:

Along with spending less time in the office, workers in remotable jobs spent more time working alone after the pandemic, logging 1.2 more work hours alone per day relative to nonremotable workers (58.0% increase; P < 0.0001).

Even for those of us who are introverts, more alone time may not necessarily be a good thing. Emanuel et al. are concerned about how WFH and working alone affect mental health. Their main outcome variable is the Kessler (K-6) Psychological Distress Scale, which is:

...based on how often in the past 30 days the respondent felt worthless, hopeless, restless, nervous, that everything is an effort, or so sad that nothing could cheer them up...

Their main source of data is the Panel Study of Income Dynamics covering the period from 2011 to 2023 (from which they exclude the pandemic years 2020 and 2021). Analysing that data, they find that:

Between the pre-and postpandemic periods, mental distress increased for everyone, but it increased significantly more for those in remotable jobs...

Among those in remotable jobs, there was a 0.3 unit increase in the K-6 distress score relative to an average score of 3.0 before the pandemic (standard deviation change = 0.08; P = 0.063) in the Panel Study of Income Dynamics (PSID). In the National Health Interview Study (NHIS), we found the same 0.3 unit deterioration (P = 0.007). We saw deterioration in each of the six subcomponents of the K-6 distress scale: feeling worthless, hopeless, restless, nervous, that everything is an effort, and so sad that nothing can cheer them up...

Importantly, the deterioration in mental health is concentrated among people living alone, which is consistent with the idea that WFH affects mental health through increasing social isolation. Emanuel et al. also find that people in remotable jobs are more likely to seek help from a mental health practitioner, and take relatively more prescription medications for mental health conditions such as anxiety or depression. These changes aren't simply the result of greater flexibility allowing more time to be devoted to health care generally, as there was no change in visits to the doctor and no change for other prescription medications such as statins.

Finally, Emanuel et al. looked at whether the rise of generative AI, rather than the increase in WFH, might explain the results (an important check, given the paper I will blog about tomorrow). They find that results from the same analysis, but substituting an AI occupational exposure index in place of the 'remotability' index, are not statistically significant.

Now, many workers are very keen on WFH - as noted in this post, about half of Australian workers would be willing to give up some salary in order to work from home. Why would people choose more WFH if it may worsen their mental health? Of course, a rational worker would weigh up the benefits and costs of WFH, and may decide that the mental health costs are more than offset by other benefits. However, Emanuel et al. point to another related possibility, which is:

...that the benefits of remote work (e.g., skipping a daily commute) are immediate and salient, whereas the costs of remote work (e.g., frayed connections with co-workers) take time to materialize.

So, a rational worker may be essentially weighing up benefits that occur today, against uncertain costs that may occur sometime in the future and therefore should be discounted (in the same way that we should discount future cashflows in a financial analysis). In that sort of exercise, where the mental health costs are discounted, it is more likely that workers would choose to work from home. They would be even more likely to do so if they are quasi-rational and heavily discount the future, as I note in the first week of my ECONS102 class. In that case, the mental health costs would be heavily discounted. Finally, maybe workers are simply unaware of the mental health costs of WFH. If that is the case, then an information intervention might be helpful in improving mental health among workers who would otherwise be WFH. In the meantime, this research suggests that the post-pandemic rise in WFH may have contributed to some part of the growing mental health crisis, especially through increased time spent alone.
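To make the discounting argument concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the dollar-equivalent benefit and cost, the three-year delay, the discount factor of 0.95 per year, and the present-bias parameter of 0.7 are all made up, and the beta-delta form is just one common way of modelling a quasi-rational worker who heavily discounts the future.

```python
def present_value(amount, years, delta=0.95, beta=1.0):
    """Beta-delta discounting of a future amount.

    beta = 1 gives standard exponential discounting; beta < 1 adds an
    extra penalty on anything that is not immediate (present bias).
    """
    if years == 0:
        return amount
    return beta * (delta ** years) * amount


benefit_today = 100.0  # hypothetical value of WFH benefits received now
future_cost = 130.0    # hypothetical mental health cost, felt later
years = 3              # hypothetical delay before the cost arrives

pv_exponential = present_value(future_cost, years)
pv_present_biased = present_value(future_cost, years, beta=0.7)

for label, pv_cost in [
    ("no discounting", future_cost),
    ("exponential discounting", pv_exponential),
    ("present-biased discounting", pv_present_biased),
]:
    choice = "chooses WFH" if benefit_today > pv_cost else "does not choose WFH"
    print(f"{label}: cost today = {pv_cost:.2f}, worker {choice}")
```

With these made-up numbers, ordinary discounting shrinks the future cost but not by enough to change the decision, while the present-biased worker discounts the cost so heavily that WFH looks worthwhile.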

[HT: Marginal Revolution for the Emanuel et al. article]

Monday, 8 June 2026

Maybe hosting the Olympics just shuffles income around a country, rather than increasing it

There is a large, and still growing, literature on the economic impact of large sporting events (see this post, and the links at the end of it, for some examples). My conclusion from that body of research is that large sporting events are expected to generate large economic impacts (based on studies conducted before the event), but generally the actual economic effects are small or non-existent (when measured after the event). However, the studies are typically based on a single event, or a small number of events. Are the typical null results driven by a small sample size and if so, would a larger and more diverse sample demonstrate different results?

That is the question essentially underlying this 2021 article by Matthias Firgo (Austrian Institute of Economic Research), published in the journal Regional Science and Urban Economics (ungated earlier version here). Firgo looks at the effect of the Olympic Games (both summer and winter) on regional GDP per capita in the host region (not GDP per capita in the whole host country, or only in the host city), using data from the 1992 Winter Olympics in Albertville to the 2020 Summer Olympics in Tokyo. Importantly, Firgo uses a control group made up of regions with cities that had been shortlisted by the International Olympic Committee (IOC) to host in the same year, but were unsuccessful (more on that a bit later).

Because of data limitations, Firgo focuses on GDP per capita as a percentage of national GDP per capita - essentially a relative measure of wellbeing at the regional level. Using this measure, he finds that for the Summer Olympics:

...regional per capita GDP significantly increases by 3.6 %-points (3.3 %-points) relative to national per capita GDP in the year of the event (the year before the event).

In other words, the host region’s GDP per capita rises by around 3 to 4 percentage points relative to national GDP per capita in the lead-up to the event. In contrast, there is only very weak evidence of any persistent effect of the event on regional GDP per capita, and the Winter Olympics (which are a smaller event, and typically held in smaller cities) had no significant effects. The positive effect of the Summer Olympics on regional GDP per capita in the years immediately before the event is consistent with increasing spending on infrastructure (including sporting, transport, hospitality, and cultural infrastructure) in the lead-up to a substantial event. That there is no persistent effect is fairly consistent with the other research on the economic impact of large events.

However, there are two other things to take away from this research. First, if anything these results might overstate the impact of successfully bidding for the Olympics. Whether a potential host city's bid is successful or not is not a random event. Cities that are more likely to be successful hosts should, at least in theory, be more likely to be selected as hosts. So, the control group is an imperfect comparator for the treatment cities in a way that is likely to bias the results. If successful hosts were cities that the IOC believed were already on an upward trajectory at the time of the Olympics, then that would bias upwards the estimated impact of the event. Of course, such foresight from the IOC would have to be executed seven years before the event (which is when the hosts are typically selected), but nevertheless there is potential for upward bias. That said, shortlisted cities are still likely to be a better comparison group than all non-host cities, since they had already demonstrated some capacity and willingness to host.

Second, these results tell us more about relative effects within the host country, rather than absolute economic impacts. They show that the GDP per capita increases in the host region relative to the rest of the country. Given that the overall economic impact is small to negligible, as are population changes arising around the event (both of which many other studies have shown), a large part of the relative increase in GDP per capita in the host region must arise from a combination of increased GDP per capita in the host region, and decreased GDP per capita in other regions in the same country. Effectively then, hosting the Summer Olympic Games simply shuffles income around a country in the lead-up to the games, with the host region benefitting while other regions are negatively impacted. Then after the event, there is a return to the normal inter-regional distribution of incomes.
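A small worked example shows why that follows. The sketch below assumes a country with just two regions (the host region and everywhere else), holds national GDP per capita and population shares fixed, and uses hypothetical starting values. Only the 3.6 percentage point rise comes from Firgo's results.

```python
def rest_of_country_gdp(national, host, host_pop_share):
    """GDP per capita outside the host region, given that national GDP per
    capita is the population-weighted average of the two regions."""
    return (national - host_pop_share * host) / (1 - host_pop_share)


national = 100.0       # national GDP per capita, indexed to 100 and held fixed
host_pop_share = 0.2   # hypothetical share of the population in the host region
host_before = 110.0    # hypothetical host-region GDP per capita (index)
host_after = host_before + 3.6  # the 3.6 percentage point relative increase

rest_before = rest_of_country_gdp(national, host_before, host_pop_share)
rest_after = rest_of_country_gdp(national, host_after, host_pop_share)

print(f"Rest of country before: {rest_before:.1f}")
print(f"Rest of country after:  {rest_after:.1f}")
print(f"Change elsewhere:       {rest_after - rest_before:.1f}")
```

If the national average doesn't move, a rise in the host region has to be offset elsewhere. With these assumed values, the 3.6 point gain for one-fifth of the population corresponds to a 0.9 point fall for the other four-fifths.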

The Olympic Games is a large spectacle - an opportunity for national celebration as we watch sporting heroes compete to win medals. The evidence still suggests that the Games are not a source of sustained economic growth, and that any short-run gains may be highly localised rather than national, and some of those gains come at the expense of other regions.

Saturday, 6 June 2026

Book review: The Nvidia Way

The biggest news story about stock markets over the last three years has probably been the dramatic rise of technology stocks, and particularly those related to AI. And among those stocks, one of the standout performers has been computer chip maker Nvidia. The success of Nvidia now hides the fact that the company had many close calls, where it was literally on the verge of closing down. That is one of the key facts that I learned from reading The Nvidia Way, by Tae Kim.

Kim was previously a technology columnist at Bloomberg, and he tells us he wrote several comments critical of Nvidia. Nevertheless, Nvidia allowed him to have unprecedented access to Nvidia staff and, more importantly, to CEO Jensen Huang. And that is important, because the story of Nvidia, and of 'the Nvidia way', is undeniably a story of Jensen Huang. Huang wasn't the only founder of Nvidia, but he has been the face of the company, the driving force behind its successes, and the person most responsible for picking up the pieces after its frequent failures. Kim writes that:

In all my years covering business, as a consultant, an analyst, and now as a business writer, I have never met anyone quite like Jensen. In the field of graphics, he is a pioneer. In the harsh technology market, he is a survivor. And he has been a CEO for more than thirty years - marking him, as of this writing, the fourth-longest currently-serving CEO in the S&P 500...

Kim clearly has a lot of respect for Huang, and this shines through the whole book. Even where other authors would press on the more negative aspects of Huang's personality, such as his ultra-competitive nature, Kim is more measured:

Jensen was so competitive that he challenged other employees even when he was at a disadvantage. In high school, CFO Geoff Ribar had ranked among the top fifty chess players in the country. His boss, however, would not accept that someone else was better than him...

Jensen attempted to close the gap between his and Ribar's chess skills through brute-force learning. He memorized chess openings and sequences of moves, so that he would control the board. Yet Ribar found his playing style predictable... Every time he lost, Jensen would swipe his arm across the board, knocking over the pieces, and storm away. He would sometimes later insist on a rematch on the ping-pong table. Ribar graciously accepted, knowing Jensen was purposely shifting the competition onto more favorable territory.

It is worth noting that Huang was a champion table tennis player. His competitiveness has clearly served him well in business, and is one of the key factors in Nvidia's success.

So, what is 'the Nvidia way', after which the book is titled? Kim notes that it has several characteristics, including hiring raw talent (especially through aggressive hiring methods), an emphasis on retaining high-quality employees, a strong focus on a culture of excellence, the high demands it in turn places on those employees, and the leadership of Huang himself. Not all of these characteristics, especially not Huang, could necessarily be replicated at other companies. However, there is a lot that budding leaders could nevertheless learn from this book.

Having said that, there is one element that the book could have explored in more depth. There were many occasions where Nvidia was close to failure, including following the release of one of its very first chips. Obviously, Nvidia is wildly successful as a company now. But should we interpret the company's success in spite of its challenges as the result of good management, culture, and hard work, or should it be interpreted as luck? In other words, how much of Nvidia's observed success is simply survivor bias? Kim obviously sides with attributing the company's success to its own good efforts, but it would have been good for him to turn a more critical eye to just how lucky they had been at key points.

Despite that gripe, I really enjoyed this book. I distinctly remember buying an Nvidia GeForce graphics card many years ago. Kim does a great job of bringing to life all of the characters and their contributions to the story, as well as the key events in the life of the company. If you're interested in understanding the rise of Nvidia, this book is recommended.

Friday, 5 June 2026

This week in research #129

Here's what caught my eye in research over the past week:

  • Araya et al. (with ungated earlier version here, but in Spanish) evaluate the impact of using the CORE textbook (which I use in my ECONS101 class) in introductory microeconomics in Uruguay, in comparison with a conventional textbook, finding no systematic differences in pass or dropout rates between the two courses, but that students using CORE are significantly more likely to believe that it contributed to their academic and professional development
  • Baker et al. (with ungated earlier version here) study the staggered rollout of unionisation across Canadian universities between 1970 and 2022, and find that unionisation compressed salaries, with wages at the bottom of the unconditional distribution increasing by roughly 10 percent, while wages at the top were unaffected
  • Baker et al. (but a different Baker, and with ungated earlier version here) provide a detailed summary of different types of difference-in-differences (DiD) research designs and their associated estimators, as well as discussing covariates, weights, handling multiple periods, and staggered treatment (this will be a highly cited resource, given the number of studies that use DiD for causal inference)

Wednesday, 3 June 2026

This research doesn’t convincingly show that biodiversity is good for business

I was interested to read this article in The Conversation last month by Paul Griffin (University of California, Davis) and Martien Lubberink (Victoria University of Wellington), mainly because of statements like this:

...firms operating in areas with richer biodiversity are measurably more productive.

I thought, that's interesting. This might be a good example to use in class next trimester to illustrate the difference between correlation and causation. After all, the authors may be correct that firms operating in areas with richer biodiversity are more productive (correlation), but that doesn't mean that biodiversity increases productivity (causation).

And then I read the paper that The Conversation article was based on. And at that point, I decided that I shouldn't use this as an example of the difference between correlation and causation, because even the correlations that they find are shaky at best.

The approach that Griffin and Lubberink take is to look at the relationship between measures of business output and measures of biodiversity. Their measure of business output is sales or gross profit, taken from Stats NZ's Longitudinal Business Database. They generally interpret this as a measure of productivity. And that is the first problem with the paper. Sales can be interpreted as gross revenue, and in some contexts sales may be used as a rough measure of gross output. But sales are not a good measure of productivity, and are not a good measure of the economic value created by a business. The more appropriate measure would be value added, or at least something closer to profit. To see why, consider two firms that both produce a product that sells for $1,000 per unit, and both firms sell 1,000 units per month. Both firms have sales of $1 million per month. Firm A buys the product wholesale at a cost of $800 per unit, then adds a mark-up. The value added of Firm A is $200,000 per month. Firm B buys raw materials of $200 per unit, adds labour of $300 per unit, and then sells the product. Because labour is part of the value that a firm adds, rather than an intermediate input, the value added of Firm B is $800,000 per month (and even after paying for labour, Firm B is left with $500,000 per month). Firm B creates a lot more economic value than Firm A, and yet measured by sales they are the same. Sales are therefore a poor measure of productivity. Gross profit is less problematic, because it subtracts at least some intermediate input costs, but even gross profit is not a pure measure of value added or productivity.
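The two-firm comparison can be checked with a few lines of Python. This is only a sketch of the arithmetic in the example: the dictionary keys and function name are made up for illustration, and Firm A's labour cost is assumed to be zero because the example doesn't mention one. It reports sales alongside two margins (sales less purchased inputs, and sales less purchased inputs and labour), so the comparison doesn't hinge on which margin is used.

```python
# Per-unit figures from the two-firm example. Firm A's labour cost is an
# assumption (zero), since the example does not give one.
firms = {
    "Firm A": {"price": 1000, "units": 1000, "purchased_inputs": 800, "labour": 0},
    "Firm B": {"price": 1000, "units": 1000, "purchased_inputs": 200, "labour": 300},
}


def monthly_figures(firm):
    """Return monthly sales and two margins for one firm."""
    sales = firm["price"] * firm["units"]
    over_inputs = (firm["price"] - firm["purchased_inputs"]) * firm["units"]
    over_inputs_and_labour = over_inputs - firm["labour"] * firm["units"]
    return sales, over_inputs, over_inputs_and_labour


results = {name: monthly_figures(firm) for name, firm in firms.items()}

for name, (sales, over_inputs, over_both) in results.items():
    print(
        f"{name}: sales ${sales:,}, "
        f"sales less purchased inputs ${over_inputs:,}, "
        f"sales less purchased inputs and labour ${over_both:,}"
    )
```

Both firms report identical sales, but on either margin Firm B is well ahead of Firm A, which is the variation that a sales-based measure cannot see.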

As a measure of biodiversity, or more accurately as a set of proxies for biodiversity-related conditions and pressures, Griffin and Lubberink use a variety of indicators that they call 'biodiversity abundance markers' (which for some reason they use the acronym BDAs to represent). They aggregate data from a range of sources for their various BDAs (which I will discuss in further detail below), with the data at the SA2 level (SA2s are geographical areas approximately the size of suburbs in urban areas, and larger in rural or remote areas). They note that:

For each SA2, we define a vector of “biodiversity abundance markers” (BDAs), where each ranges from 0 to 100. We denote these ranks as BDA1, BDA2, … , BDAm. We then assign them to an SA2 and, therefore, to the businesses and employees in the same SA2. For a given BDA in an SA2, BDAm = 0 means complete biodiversity loss (high pressure from biodiversity loss) for marker m. BDAm = 100 (low pressure from biodiversity loss) is equivalent to an SA2 with an undisturbed or fully intact natural state.

So far, so good. The only issue with that approach is that the measures of biodiversity don't have a natural interpretation, because they are just an index. But we often work with indices - you just need to be cautious about how you interpret the magnitude of the effects. Griffin and Lubberink start by showing the correlation between each of their BDA measures and their measures of business output.

However, then they want to create an overall index of biodiversity, and to do this they:

...multiply each BDA by its SA2 land area and denote the result as an empirical proxy for the natural capital (n) of an SA2 applicable to the businesses operating therein.

Remember that the BDA is an index, bounded between 0 and 100, and it has no natural interpretation in terms of magnitude. So, multiplying the index by the land area of the SA2 is not meaningful, because the BDA is not a measured biodiversity stock per square kilometre. I guess it might make sense if you wanted to calculate a weighted average index, where the weights are based on SA2 land areas, but that isn't what Griffin and Lubberink are doing. Their approach is problematic because it mechanically causes the measured biodiversity to be higher in rural areas ceteris paribus (holding all else equal), where SA2s are larger, and lower in urban areas, where SA2s are smaller. Within urban areas, ceteris paribus it causes higher measured biodiversity in industrial and commercial areas, where SA2s are larger, and lower in residential areas, where SA2s are smaller.
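To make the mechanical effect concrete, here is a minimal sketch with two made-up SA2s that have exactly the same index value but very different land areas. The index values, land areas, and function names are all hypothetical, and this is not the authors' code. It contrasts the index-times-area proxy with an area-weighted average, which stays on the original 0 to 100 scale.

```python
def area_scaled_proxy(bda_index, land_area_km2):
    """Index multiplied by land area, as described in the quoted passage."""
    return bda_index * land_area_km2


def area_weighted_average(bda_indices, land_areas_km2):
    """Area-weighted mean of the index, which remains on the 0-100 scale."""
    total_area = sum(land_areas_km2)
    weighted_sum = sum(b * a for b, a in zip(bda_indices, land_areas_km2))
    return weighted_sum / total_area


# Two hypothetical SA2s with the same index value (60) but different land areas.
urban = area_scaled_proxy(bda_index=60, land_area_km2=2)
rural = area_scaled_proxy(bda_index=60, land_area_km2=500)
combined = area_weighted_average([60, 60], [2, 500])

print(urban, rural, combined)
```

The rural SA2 ends up with 250 times the measured 'natural capital' of the urban SA2, purely because it is 250 times larger, while the area-weighted average correctly stays at 60.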

Griffin and Lubberink then aggregate their index-multiplied-by-land-area measures in various ways. The aggregation approach they adopt is fine, but when you aggregate numbers that are not individually meaningful, the result is not meaningful either.

But let's take a step back, because there is another problem. Griffin and Lubberink pitch their analysis as based on a Cobb-Douglas production function. That is fine - a Cobb-Douglas function is a way of relating inputs to output. We already know that their measure of output is faulty. Their inputs are also faulty. Their three-factor Cobb-Douglas function includes inputs of financial capital, human capital, and natural capital.
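For readers who haven't met it before, a generic three-factor Cobb-Douglas function is usually written along the following lines. The symbols here are illustrative assumptions rather than the paper's own notation: Y is output, A is a productivity term, K is financial capital, H is human capital, N is natural capital, and the exponents are the output elasticities of each input. Taking logs gives the linear form that is typically estimated by regression.

```latex
Y = A \, K^{\alpha} H^{\beta} N^{\gamma}
\quad\Longrightarrow\quad
\ln Y = \ln A + \alpha \ln K + \beta \ln H + \gamma \ln N
```

The estimated exponents are only interpretable as elasticities if each variable really measures the input it is supposed to represent.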

Griffin and Lubberink measure human capital as the number of employees working in business units in an SA2. That is really a measure of labour input, not human capital. To measure human capital (as well as labour), it would be better to also consider the education level of those employees, since more educated (not to mention more experienced) employees have more human capital. So, their measure is unlikely to pick up the important variation in human capital across SA2s, but it will pick up differences in labour input. But as a measure of combined labour and human capital, their measure will bias downwards measured human capital in urban areas, where education levels are highest, and bias upwards measured human capital in rural and remote areas, where education levels are lowest.

Griffin and Lubberink measure financial capital by the number of business units operating in an SA2. That is not financial capital. That is business density. The relationship between the number of firms and financial capital is not straightforward. An SA2 might have lots of small firms that have low aggregate financial capital, or one large firm that has a lot of financial capital.

Finally, we come back to natural capital, which is measured as noted above. However, some of the measures of biodiversity that Griffin and Lubberink use are better suited than others as a measure of natural capital. The definition of capital is important here - capital is stored up resources that can be used to produce things. Financial capital is stored up savings that can be used in the future. Human capital is stored up education and experience that can be used in the future. So, capital is a stock. It is not a flow.

Now, let's consider the BDA measures one by one. The first (BDA1 - Land Use) is "1 - the ratio of the number of agriculture and forestry business (primary industry) units in an SA2 to the total number of business units in an SA2". This is not really a measure of land use, because it isn't measured in terms of land. The relative size of the businesses is not taken into account, so many small farms would increase this measure compared to fewer large farms. It is also difficult to see how this is a measure of biodiversity.

The second measure (BDA2 - Infrastructure) is "1 - the rank of the number of business units in an SA2 to the land area in km2 of an SA2 divided by the total number of SA2 observations". It is difficult to understand why this BDA is measured as a rank, whereas BDA1 was not. It is also difficult to see how the number of firms is a measure of infrastructure, or how it relates to biodiversity. This measure will tend to be lower in urban areas, where many small businesses are clustered, than in rural areas. So, this is likely just a measure of urbanicity, not a measure of infrastructure or biodiversity.

The third measure (BDA3 - Mining) is "1 - ratio of the number of mining business units in an SA2 to the total number of business units in an SA2". Like BDA1, this doesn't account for the size of the mines. If you have a small quarry, that counts the same in this measure as the enormous Martha Mine in Waihi. It is more plausibly a measure of (negative) biodiversity than the other measures though. Or at least it would be, if the size of the businesses were taken into account.

The fourth measure is climate change in two forms (BDA4a - Climate Change, and BDA4b - Heat Spell Anomaly), which are measured as "the sum of the presence of a heat spell, cold spell, rain spell, or wind spell in an SA2 divided by 4" and "the rank of the heat spell anomalies in an SA2 divided by the total number of SA2 observations". They measure heat spells, cold spells, rain spells, and wind spells as the number of days on which the measured variable (temperature, rain, or wind) falls above (or below, for cold spells) the 'rolling mean 95th percentile' (it isn't clear what the term 'rolling mean 95th percentile' actually means). It isn't clear why adding those four up makes any sense, but perhaps you could just label them weather anomalies. As with BDA2, it isn't clear why the second form of this measure uses the rank when the actual number of heat spells could be used instead. Again, this isn't really a direct measure of biodiversity, but to the extent that weather anomalies impede biodiversity, it may be a reasonable proxy.

The fifth measure (BDA5 - River Diversity) is "River condition × 100, where River condition = Percentage of insect and related species in an SA-located river compared to all possible species". This is probably the clearest actual biodiversity measure in the paper. However, it is still a narrow one, because although it captures the presence of insect and related species in rivers, it doesn't capture biodiversity more generally. It also doesn't consider the abundance of species. 

The sixth measure (BDA6 - Drinking Water) is "An indicator of the average improvement (higher BDA) or deterioration (lower BDA) in drinking water quality in a region based on periodic water testing". This measure is not a stock, it is a flow. It is a change over time, which gives no indication of the stock available for businesses to use in production. Since Griffin and Lubberink are interested in natural capital as a stock, it would have been better to use the level of drinking water quality, rather than the change in drinking water quality over time. This measure also has problems of reverse causality. Griffin and Lubberink use their measures as if they are business inputs. However, water quality is likely an output of business. Consider a dairy farm that reduces the water quality in a nearby stream. By including this variable in the analysis as an input, Griffin and Lubberink have the causal relationship backwards.

The seventh measure (BDA7 - Plant Diseases) is "1 - percentage of plant diseases in an SA-unit compared to all possible plant diseases". Let's put aside the impossibility of measuring "all possible plant diseases". This might be a useful measure of (the lack of) biodiversity, but it would be better to directly measure plant biodiversity, rather than proxying for it by plant diseases.

The eighth measure (BDA8 - Matauranga) is "Percentage of SA2 population of Māori descent". This is a socio-cultural proxy for relationships with nature, not a measure of biodiversity.

The ninth measure (BDA9 - Population Density) is "1 - the rank of the population density in an SA2 divided by the total number of SA2 observations". Again, it isn't clear why the rank is used here, rather than actual population density. Also, like BDA2 this is a measure of urbanicity, not biodiversity.

The tenth measure (BDA10 - Possum Count) is "1 - the rank of the possum count in an SA2 divided by the total number of SA2 observations". Again, it isn't clear why the rank is used here, rather than some standardised measure of the actual possum count, or possums per land area. It is an indicator of biodiversity though, since more possums would typically mean fewer of other species.
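One way to see what is lost by using ranks (as in BDA2, BDA4b, BDA9, and BDA10) is the following sketch. It assumes that ranks run from 1 for the smallest value to N for the largest, and it ignores ties, neither of which is specified in the definitions quoted above. The possum counts are made up, and the function name is hypothetical.

```python
def one_minus_rank_share(values):
    """Return 1 - rank/N for each value, where rank 1 is the smallest value.

    Assumes there are no ties (how ties are handled is not described in the text).
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0] * n
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return [1 - rank / n for rank in ranks]


# Hypothetical possum counts for four SA2s. In the second set, the last SA2 is overrun.
modest_spread = one_minus_rank_share([10, 20, 30, 40])
extreme_spread = one_minus_rank_share([10, 20, 30, 40_000])

print(modest_spread)
print(extreme_spread)
```

Both sets of counts produce identical BDA values, so the rank-based measure cannot distinguish an SA2 with slightly more possums from one with a thousand times more.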

Finally, the eleventh measure (BDA11 - Non-Drought Probability) is "1 minus the ratio of the number of drought weather events in an SA divided by the sum of the number of drought plus non-drought weather events in an SA2". It's not clear what a 'non-drought weather event' is, or why this is a sensible measure. This measure is probably correlated with the climate change measures in BDA4 in any case.

So, across the eleven (or twelve, if you treat the two BDA4 measures as separate) BDA measures, there are only three that are really measures of biodiversity, and there are a few that are likely to be meaningfully correlated with biodiversity. The issue is not that every variable must be a perfect direct measure of biodiversity. Empirical research often relies on proxy measures. The issue is that the interpretation should match the proxy. A variable that measures urbanicity, business density, ethnicity, or weather anomalies may be related to biodiversity, but it is not itself biodiversity. If those variables are then combined into a single measure of 'natural capital', the interpretation becomes difficult. The estimated relationship may reflect biodiversity, but it may also reflect a mix of urbanicity, industry mix, infrastructure, climate, or demographic composition. Conflating urbanicity with biodiversity is an especially clear problem for Griffin and Lubberink's analysis, given that they multiply their BDA measures by SA2 land area when constructing their overall measure of natural capital, as I noted earlier.

Finally, Griffin and Lubberink attempt to exploit what they describe as a quasi-natural experiment. The idea is that a number of government policy changes in 2016 and 2017 were intended to improve the environment. If these policies successfully increased biodiversity, then the relationship between biodiversity and business output should become stronger after those policies were implemented. However, this is not a particularly convincing identification strategy. The policies were national, so there is no obvious untreated control group within New Zealand. The test is essentially asking whether the relationship between natural capital and business output changed after 2016 or 2017. But many other things could also have changed around the same time, including macroeconomic conditions, industry conditions, investment decisions, business confidence, and local economic trends. Moreover, the policies themselves may have affected firms through channels other than biodiversity, not least through expectations about future policy changes. That makes it difficult to interpret any post-2016 or post-2017 change as evidence that biodiversity caused higher business productivity. This part of the analysis instead shows that the estimated association between natural capital and business output is not stable over time, and that might be due to policy changes or any number of other reasons.

There are other issues that I could pick out as well, such as not including SA2 fixed effects in their analysis (so that time-invariant differences between SA2s are not controlled for). To be fair, including SA2 fixed effects would absorb much of the cross-sectional variation in biodiversity that the authors are trying to use. But that is exactly the problem, because without SA2 fixed effects, the estimates may reflect other time-invariant differences between SA2s, and not differences in biodiversity.

The overall takeaway from this paper is not that correlation is not the same as causation. It is that, if you want to demonstrate even a correlation, you first need to use the right data in the right way. Biodiversity might be good for business. Business might be good for biodiversity. This research doesn't convincingly estimate the relationship between biodiversity and business output.

Tuesday, 2 June 2026

Genshin Impacts on Chinese trade

During the pandemic, when people were isolated at home, some people discovered a passion for sourdough. Others picked up a book. But plenty of people got (more) heavily into gaming. In late 2020, Genshin Impact was launched into that environment, and immediately exploded in popularity despite being released by a Chinese gaming studio little known to Western gamers. The interesting thing about Genshin Impact is that it doesn't 'Westernise' its Chinese foundations, and through that it may have opened a window to Chinese culture that many Western gamers wouldn't otherwise have noticed.

What effect, if any, did this have? That is essentially the question that this new article by Tianyu Wang (Jiangsu Provincial Academy of Social Sciences) and co-authors, published in the journal China Economic Review (sorry, I don't see an ungated version online), tries to answer. Specifically, they look at the impact on Chinese exports, using a difference-in-differences (DiD) strategy. This involves comparing trade between China and countries with more, or less, exposure to Genshin Impact, between the periods before and after its release (which they set as October 2020, the first full month after the open beta of Genshin Impact was released on 28 September 2020). Their data is monthly export data from China to other countries, from the UN Comtrade database.

However, there are a couple of oddities with the analysis. First, Wang et al. control for a variety of variables in their regression model. However, two of the variables they control for are the log of GDP and the log of GDP per capita. Because their model is a log-linear model, this means that they are unnecessarily controlling for GDP twice. To see why, consider this equation:

ln Y = a + b ln X + c ln[X/Z]

You can think of X as GDP and Z as population, so X/Z is GDP per capita. Since ln[X/Z] is equal to [ln X - ln Z], that equation is really:

ln Y = a + b ln X + c ln X - c ln Z = a + [b+c] ln X - c ln Z

So, the coefficients on both GDP and GDP per capita are awkward and not directly interpretable. The coefficient on log GDP per capita in their model [c] is actually the negative of a coefficient on log population, while the coefficient on log GDP [b] is not the full effect of GDP, which is the sum of the two coefficients [b+c]. Fortunately though, this just adds unnecessary complexity to their model. It doesn't bias the coefficients in the rest of the model.
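If you'd rather check the algebra numerically, the following sketch evaluates both forms of the equation and confirms that they give the same answer. The coefficient values and the GDP and population figures are arbitrary numbers chosen for illustration, not estimates from the paper, and the function names are hypothetical.

```python
import math


def log_y_original(a, b, c, gdp, population):
    """ln Y = a + b ln(GDP) + c ln(GDP per capita)."""
    return a + b * math.log(gdp) + c * math.log(gdp / population)


def log_y_rearranged(a, b, c, gdp, population):
    """ln Y = a + (b + c) ln(GDP) - c ln(population)."""
    return a + (b + c) * math.log(gdp) - c * math.log(population)


# Arbitrary illustrative values (not taken from the paper).
a, b, c = 1.0, 0.8, 0.3
gdp, population = 2.5e11, 5.0e6

original = log_y_original(a, b, c, gdp, population)
rearranged = log_y_rearranged(a, b, c, gdp, population)

print(original, rearranged)
```

The two forms agree up to floating-point rounding, so the model with log GDP and log GDP per capita is just a relabelled version of a model with log GDP and log population.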

Second, Wang et al. use Google Trends data as the treatment variable. This seems appropriate, because Google Trends will pick up differences in cross-country interest in Genshin Impact. Specifically, they create a Google Trends Index (GTI) that captures the search intensity for their term of interest. However, in their main analysis, they don't use a GTI based on searches for 'Genshin Impact'. Instead, they use a GTI based on searches for 'Sony'. Their explanation for that is:

There is evidence indicating that Sony and miHoYo maintain a very close relationship, and that Sony has played an important role in the global promotion of Genshin Impact.

They also say that:

...regressing China's exports directly on Genshin Impact GTI is highly endogenous...

Both of those statements may be true, and Wang et al. provide a variety of evidence in support of the close relationship between Sony and Genshin Impact. However, they don't provide similar evidence for why searches for 'Genshin Impact' would be endogenous in a way that searches for 'Sony' wouldn't. One possibility is that they are worried that search intensity for 'Genshin Impact' is correlated with countries' pre-existing closeness to China, or with pre-existing interest in Chinese cultural products. A difference-in-differences strategy, especially one that controls for country-level differences in pre-treatment trade, should already be controlling for those issues. However, time-varying shocks that are correlated with both Genshin Impact searches and Chinese exports after 2020 would remain. For example, the Genshin Impact GTI would also capture changes in favourability of views towards China that change for reasons other than Genshin Impact. Using the 'Sony' GTI may therefore reduce one problem, but it also introduces another, since Sony searches could reflect many things unrelated to Genshin Impact or China.

Fortunately, Wang et al. do report results based on the GTI for 'Genshin Impact' in their online appendix, and the results are not so different from what they get with the 'Sony' GTI. Apparently, this was suggested by one of the journal reviewers. Honestly, I think the results based on the 'Genshin Impact' GTI are the more plausible results, so I'm going to focus on them. And in those results, reported in Table D6 in the online appendix, they find that following the open beta release of Genshin Impact, every one-unit higher GTI for 'Genshin Impact' for a country is associated with a 0.215 percent increase in exports from China to that country. Unfortunately, they don't report the summary statistics for the 'Genshin Impact' GTI, so it is difficult to interpret. It is also difficult to interpret because the GTI is a normalised measure of search intensity relative to all Google searches in a given country and period. However, for comparison, the effect using the 'Genshin Impact' GTI is slightly larger than what they report for the 'Sony' GTI, which is a 0.186 percent increase in exports for each one-unit higher 'Sony' GTI.

Either way, the results suggest that countries where Genshin Impact was a bigger phenomenon experienced larger increases in exports from China than countries where Genshin Impact was less impactful. Wang et al. then turn to the mechanisms that might explain this change, using Pew Global Attitudes and Trends data. They report that:

Although we do not find evidence that Genshin Impact improved favorable perceptions of China, we do find evidence that it reduced unfavorable perceptions. This effect is primarily driven by a decline in mild aversion; there is no significant change in strong aversion. This result is intuitive—individuals who strongly dislike China are unlikely to revise their views solely because of a video game.

They also find that media narratives became more positive following Genshin Impact's release, for countries where the 'Sony' GTI was higher. However, this result is only suggestive as it was statistically insignificant.

One interesting final aspect of the paper is that Wang et al. used data on cultural distance to further explore the results, finding that:

...as bilateral cultural distance increases, the promotional effect of Genshin Impact on China's exports significantly diminishes.

So, Genshin Impact had a larger trade impact for countries with greater cultural similarity to China. That suggests that, while it might be an interesting narrative to suggest that Genshin Impact exposed the world to China, improving perceptions of China and increasing trade, the effect was actually concentrated on the countries that were already most similar to China.

This paper presents some interesting findings. However, it clearly isn't the last word on whether the international sharing of cultural products can have tangible effects on international trade, beyond their effects on the trade of the cultural product itself. It would be interesting to see if there are similar impacts for Korean cultural products, for example, or Bollywood movies (or Nollywood movies, for that matter).

Monday, 1 June 2026

Turkish inflation drives consumers to incur extreme shoe-leather costs

Inflation imposes costs on people. One of the costs of inflation is that it gives people strong incentives to spend time and effort avoiding higher prices. They can do that by reducing their cash holdings, searching harder for low prices, or, in extreme cases, travelling to shop elsewhere. When inflation is high, and prices are increasing rapidly, consumers have a strong incentive to spend a lot of time doing these things. Economists call these shoe-leather costs, because when consumers have to walk around a lot of stores in order to compare prices, their shoes wear out. At least, that's a literal explanation of the term. In an age where prices are published online, the actual act of 'walking around to compare prices' is a lot easier on the shoes. Or is it? An extreme example has been playing out recently, as reported in Bloomberg last November (paywalled, but you can find an ungated version here):

Almost every month, Cihan Citak gets into his car, passport in hand, and sets off from Istanbul to Alexandroupolis, a Greek seaside city 40 kilometers (25 miles) from the Turkish border. After a roughly four-hour drive, he walks the crowded aisles of the local supermarket, filling his cart with wine, cheese and other groceries that cost a fraction of what they do back home...

Cross-border retail has become routine for many who found that Turkey’s surging food prices and stronger lira make Greece a cheaper alternative for everyday purchases. The trend, while not new, is accelerating: 6% of all Turks crossing the border to Greece in the first nine months of the year were on a shopping run, the highest share of overall travelers since at least 2012, data from the country’s statistics agency show.

When inflation causes people to drive four hours in order to find lower prices, you know the shoe-leather costs must be high. The inflation rate in Türkiye is over 30 percent. That isn't hyper-inflation, but it is very high. For comparison, in New Zealand the inflation rate peaked at about 7 percent just after the pandemic, and that was the highest it had been in over 30 years. Inflation more recently has been between 2.5 and 3.5 percent, which is at or above the top of the Reserve Bank's mandated range of one to three percent in the medium to long term.

All of that is to say that Türkiye’s much higher inflation creates much stronger incentives for consumers to incur shoe-leather costs to avoid higher prices than is currently the case in New Zealand.

[HT: New Zealand Herald, also paywalled]

Friday, 29 May 2026

This week in research #128

Here's what caught my eye in research over the past week:

  • Ruggles tests Richard Easterlin's argument that the economic and social prospects of a generation are influenced by the size of the cohort relative to adjacent cohorts, and finds using US data from 1910 to 2040 that the theory fits the data well for the period from 1940 to 1980 but fails in later decades, although baby boomers exiting the labour force will likely lead to increases in wages in the future
  • de Bondt and Sun (with ungated earlier version here) use ChatGPT to classify activity sentiment scores from Purchasing Managers’ Index (PMI) news releases, then use those scores to 'nowcast' GDP, finding that on average, out-of-sample forecast accuracy improves by about 20% apart from the two most recent years
  • Skali et al. (open access) find that better-looking Swiss politicians are not more prone to rent-seeking through interest group affiliations, and do not deviate more from their voters' preferences
  • Jin, Karim, and Schulze (open access) find that Islamist terror attacks created significant negative abnormal returns in American and European markets, but the stock market effects of other terror attacks were almost nil

In other news, I wrote a quick take on the New Zealand Budget as part of The Conversation's coverage this week. That article also has a drop-down menu at the bottom that summarises the key Budget announcements in each area.

Thursday, 28 May 2026

Try this: Taxed

Today was Budget Day in New Zealand. The government revealed its forecasts of future revenue and its spending plans. There is a good summary of this on The Conversation (disclaimer: I wrote the blurb at the top of that summary).

The problem with the Budget is that the numbers are large, and it is difficult to get a good sense of the relative magnitudes. How do you interpret $1.18 billion in spending on rail network renewal and upgrades?

One of my recent students, Tyler Dunseath, created the Taxed website, which uses your income to work out how much tax you pay (weekly, fortnightly, monthly, or annually), then apportions that tax to the various categories of spending from the government accounts. So, for example, if your weekly income is $1000 before tax, and you don't adjust for ACC, KiwiSaver, or student loan repayments, you pay $165.77 in tax. Of that, $56.79 goes to social security and welfare, $37.40 goes to health, $24.25 goes to education, and so on. The results give you a better sense of how taxes are distributed.
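The apportionment itself is simple arithmetic. As a rough sketch (this is not the Taxed site's code, and the variable names are hypothetical), the dollar amounts in the example above can be converted into shares of the weekly tax bill like this:

```python
# Weekly figures quoted in the example above (income of $1000 per week before tax).
weekly_tax = 165.77
spending = {
    "social security and welfare": 56.79,
    "health": 37.40,
    "education": 24.25,
}

# Share of the weekly tax bill going to each named category, in percent.
shares = {
    category: round(100 * amount / weekly_tax, 1)
    for category, amount in spending.items()
}

# Whatever is left over is spread across all of the other spending categories.
everything_else = round(weekly_tax - sum(spending.values()), 2)

print(shares)
print(everything_else)
```

Those three categories together account for a bit over 70 percent of the weekly tax bill in this example.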

Of course, there are a number of caveats, the biggest of which is that government services are a bundle, and while Taxed might make it seem like you could in theory say, "I don't want to pay $0.32 per week for international peacekeeping", it doesn't work that way. Moreover, a lot of government spending is on services that are public goods and therefore non-excludable, so even if you could opt out of paying for them, you would still receive the benefits of them.

Second, government receives some income that is earmarked for particular purposes. For example, the fuel excise tax is earmarked for the National Land Transport Fund. So, your income tax isn't distributed in exact proportion to the government's spending on different categories, because less of your income tax goes towards transport.

Third, the site doesn't account for the taxes we pay on goods and services (GST, or excise taxes on alcohol, tobacco, or fuel), or the user charges we pay.

With those caveats in mind though, Taxed is a pretty cool way of showing how the government's spending is distributed, and in a way that most people are more likely to understand than the millions or billions of dollars cited in the budget.

Enjoy!

[HT: Tyler Dunseath]

Wednesday, 27 May 2026

Is it better to have a more educated mayor?

It seems somewhat self-evident that having a more educated mayor would be better than having a less educated mayor. However, whether education is a positive attribute for a mayor really depends on whether, and to what extent, more educated mayors act differently than less educated mayors. Do they spend more, or less? How do they spend the public budget?

This new article by Alessio Mitra (University of Kent), published in the European Journal of Political Economy (ungated earlier version here), directly addresses the second question - how does mayoral education affect public finance? Mitra uses data from municipal elections in Italy over the period from 2000 to 2015, focusing on municipalities with a population of less than 15,000 (because larger municipalities use different electoral rules). He defines a more educated mayoral candidate as one with a university degree, and a less educated mayoral candidate as one without a degree.

Mitra applies a regression discontinuity design (RDD), which involves comparing municipalities that narrowly elected a more educated mayoral candidate over a less educated candidate with similar municipalities where the more educated candidate lost to the less educated candidate. In very close elections, the identity of the winner is plausibly as-good-as random, provided there is no manipulation around the threshold related to the education of the candidates. In other words, since the difference between getting 50.01 percent of the vote and getting 49.99 percent of the vote is essentially random, the education of the winning mayoral candidate is basically determined randomly in these close elections between candidates with different education levels. With that assumption in mind, observed differences between the municipalities where a more educated candidate won and those where they lost can be attributed to the difference in mayoral education.
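The intuition can be sketched as a simple comparison of means among close elections. To be clear, this is a simplification: a proper RDD typically fits a regression on each side of the threshold rather than just comparing averages, and this is not Mitra's code. The data below are made up, the five-percentage-point bandwidth is used only for illustration, and the function name is hypothetical.

```python
from statistics import mean


def close_election_gap(elections, bandwidth=5.0):
    """Difference in mean outcome between narrow wins and narrow losses.

    Each election is a (margin, outcome) pair, where margin is the more educated
    candidate's vote share minus the less educated candidate's vote share, in
    percentage points. Only elections inside the bandwidth are compared.
    """
    narrow_wins = [outcome for margin, outcome in elections if 0 < margin < bandwidth]
    narrow_losses = [outcome for margin, outcome in elections if -bandwidth < margin < 0]
    return mean(narrow_wins) - mean(narrow_losses)


# Hypothetical (margin, public investment share in percent) pairs, made up for illustration.
elections = [
    (-12.0, 20.0),  # clear loss: outside the bandwidth, so ignored
    (-3.0, 24.0),   # narrow loss
    (-1.0, 26.0),   # narrow loss
    (2.0, 30.0),    # narrow win
    (4.0, 28.0),    # narrow win
    (15.0, 35.0),   # clear win: outside the bandwidth, so ignored
]

gap = close_election_gap(elections)
print(gap)
```

The landslide elections are dropped because municipalities that elect a mayor by a wide margin are likely to differ in many ways other than the mayor's education.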

Mitra's dataset includes more than 18,000 mayoral elections, of which 1,211 have a margin of victory of less than five percent (which he defines as a close election, and includes in the analysis). He looks at the differences in public expenditure, initially focusing on changes in the share of spending devoted to operational expenses (or 'current expenditure' as he terms it) or public investment. In this, Mitra finds that:

When an educated mayor is elected by chance, public investment rises by 3 percentage points of total expenditure compared to a less educated counterpart.

Digging down into the allocation of that public investment, he finds that:

...educated mayors allocate an additional 1 percentage point of total expenditure to education investment, accounting for one-third of the overall increase in public investment.

And going a bit deeper than that:

Among education investments, immovable assets dedicated to nurseries receive the largest increase in resources.

Consistent with Italy’s balanced budget requirement on municipalities, there is no significant change in fiscal deficit. That means that the additional spending devoted to public investment must mean a corresponding reduction in operational expenditure. Mitra doesn't really dig into that at all.

What we take away from this paper is that more educated mayors devote more spending to education. In the Waikato Economics Discussion Group today, we discussed what mechanisms might underlie this difference, which is something that Mitra didn't explore. Perhaps more educated mayors see more value in education. After all, they invested more in their own education than a less educated mayor did. However, that's not entirely consistent with spending more on public investment in early childhood education.

A second possibility is that more educated mayors have a lower intrinsic discount rate, increasing their willingness to make long-term investments, both in their own education and in the education of their citizens. This is more consistent with devoting spending to public investment in early childhood education.

A third possibility is that more educated mayors may be better at the administration of public investment, such as project approvals, capital budgeting, grants, or procurement. This means that they have greater capacity for public investment projects. However, that greater capacity wouldn't necessarily be more apparent for public investment in education, or early childhood investment.

However, an intriguing but speculative fourth possibility is that more educated mayors understand that public investment can be used strategically to affect demographics. Many municipalities in Italy are facing extreme population ageing and/or declining populations. Mitigating (but probably not reversing) those population changes may be possible through creative policy. If the municipality invests in early childhood education, that may make the municipality more attractive for young parents to relocate to, and may reduce cost pressures that hamper fertility. The problem with this as an explanation is that it isn't clear that these trends and this policy solution would be more apparent to a more educated mayor than to a less educated one.

The second possibility seems to me like the most plausible. However, exploring the reasons why more educated mayors spend more on public investment, particularly in education, is a worthwhile exercise for future research.

One last point is that the effects are actually quite modest. The total budget for a municipality of 15,000 population would be around €15-25 million per year (based in part on this and this, both in Italian, but see also here for public finance data for all Italian municipalities). A reallocation of three percentage points to public investment represents up to an additional €750,000 per year. And if one-third of that is spent on public investment in education, that is an additional €250,000 per year. It's not nothing, but it's certainly not building multiple new schools. Maybe it's an additional small school building per year.
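The back-of-the-envelope calculation works like this. The budget range is the rough €15-25 million estimate above, and the percentage-point shifts are the ones quoted from the paper. The function and variable names are hypothetical.

```python
def reallocation(total_budget_eur, investment_shift_pp=3.0, education_shift_pp=1.0):
    """Return (extra public investment, extra education investment) in euros per year.

    Both shifts are expressed in percentage points of total expenditure.
    """
    extra_investment = total_budget_eur * investment_shift_pp / 100
    extra_education = total_budget_eur * education_shift_pp / 100
    return extra_investment, extra_education


# Rough lower and upper bounds for the annual budget of a municipality of 15,000 people.
low_investment, low_education = reallocation(15_000_000)
high_investment, high_education = reallocation(25_000_000)

print(low_investment, low_education)
print(high_investment, high_education)
```

So the additional public investment ranges from about €450,000 to €750,000 per year, of which about €150,000 to €250,000 is education investment.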

So, is it better to have a more educated mayor? This research suggests yes, but that relies on a normative view that more spending on public investment, particularly in education, is overall a good thing. However, the size of the effect doesn't suggest transformational change, and we don't really know what the trade-offs are in terms of what categories of operational spending were reduced. A university degree does not necessarily make someone a better mayor, and this paper cannot tell us whether more educated mayors have better preferences, longer time horizons, or simply greater administrative capacity. What it does show is that who gets elected can change not just how much is spent, but what kind of future a municipality chooses to invest in.

Monday, 25 May 2026

Does the future of higher education look more like a mentoring pyramid scheme?

In response to my recent post about the future of higher education and one-on-one mentoring, one of my students from last year, Yunze, got in touch via email to offer a potential solution:

...I wonder whether it is possible to set a clear academic threshold within each discipline. If students who reach this threshold could mentor upper‑middle‑level students, while professors spend only a small amount of time supervising the overall direction, the system might become more sustainable. However, I suspect this could harm the interests of the top students, since they might otherwise use that time to further advance their own academic achievements, and If [sic] they fail to successfully train students with real research ability, it would likely damage both the university’s reputation and the professor’s own reputation.

You know, I think Yunze is right on the money here. Consider the problems I outlined in the earlier post: (1) the signalling value of education is falling due to generative AI; (2) a one-on-one mentoring approach may be a solution; but (3) one-on-one mentoring doesn't scale due to limited faculty time. If one-on-one mentoring is not conducted between faculty and students, but works more like a pyramid mentoring model, then this might actually work, not just for students, but for faculty and for universities as well.

So, let's think it through. But first, remember that the mentoring model I introduced in the earlier post is not simply a model of small classes, where senior students perform limited teaching roles, such as tutoring. This is a model of genuine mentoring, where the mentor encourages the mentee to become a builder, in the words of Auren Hoffman. A builder creates things, and it is the act of building, and the learning alongside that, which will be a durable signal to future employers. In relation to mentoring, I said in that post that mentors should do the following for their mentees:

Teach them to be builders. Encourage them to create things. Work with them and chart a path forward for their success.

If faculty provide one-on-one mentoring to a small number of senior students, then that makes better use of faculty time than them mentoring hundreds of first-year students. The senior students can then each mentor several second-year students, who in turn can then mentor several first-year students. [*] In this model, faculty time is targeted at the senior students, where the impact of faculty on student employment outcomes may be greatest.

Students benefit from helping junior colleagues to become builders, where the signalling value may remain even in the face of generative AI. Even better, mentoring provides student mentors with an opportunity to build - they may be able to point employers to the success of their mentees as an example of their building, talking also about what went wrong in the mentoring relationship, and what they learned from the experience.

In this mentoring pyramid model, universities retain a key role, but that role becomes very different. Universities essentially become a platform, connecting students with mentors - first-year students with second-year mentors, second-year students with senior student mentors, and senior students with faculty mentors. In the terms of my earlier post, the university runs their own OnlyStudents platform.

Of course, this platform role creates a new problem for universities. If mentoring works mainly as a way of matching students with mentors, then the market may not need eight OnlyStudents platforms in New Zealand, or thousands of OnlyStudents platforms worldwide. A small number of large platforms could have a big advantage in that case - more students attract more mentors, more mentors improve the quality of matching, and better matching attracts still more students. Those network effects could create a winner-take-all dynamic, in which universities would struggle to differentiate themselves simply by running their own mentoring platforms, and where a single surviving OnlyStudents platform might be the ultimate outcome. However, that conclusion depends on the strength of the network effects. If effective mentoring also depends on institutional trust, disciplinary reputation, local employer connections, pastoral care, or an in-person community, then universities may retain some defensible advantages. Geography alone probably won’t be enough, especially if online mentoring is close to being as effective as in-person mentoring, but local connections might still matter. So the question for universities is not just whether they can build their own OnlyStudents, but whether they can attach that platform to something that a larger, more generic OnlyStudents cannot easily replicate.

Universities may also retain a role in the initial and ongoing training of mentors. Since each student, and each faculty member, will need to be a mentor to one or more others lower down in the pyramid, they will need to understand how to mentor. That means universities would not simply be matching students with mentors. They would also need to train mentors, monitor the quality of mentoring relationships, and intervene when mentor-mentee relationships are not working well. Moreover, the adoption of a mentoring pyramid model is likely going to change who the most successful students (and faculty members) are. The top students do not necessarily make the best mentors (or the best tutors, as I have learnt across years of coordinating tutors in my first-year papers). Good mentoring requires a specific skill set, but it is those skills that may also demonstrate the quality of the student as a builder - a signal of high quality for employers.

A further point about the pyramid mentoring model is that it likely requires a strong filtering effect to be financially viable. Since each faculty member can only mentor a limited number of senior students, and each of those senior students can only mentor a limited number of second-year students, who in turn can only mentor a limited number of first-year students, each level of the pyramid probably needs to be somewhat wider than the levels above it. To achieve that, student progression needs a strong filter, limiting the number of students who progress from first-year to second-year, and from second-year to senior.

Let's consider some simple numerical examples that illustrate why filtering is needed. If each faculty member mentors five senior students, and each senior student mentors five second-year students, who each mentor five first-year students, then the pyramid contains 155 students per faculty member - five senior students, 25 second-year students, and 125 first-year students. A model where each faculty member’s salary is covered by fees from 155 students, setting aside any contribution to central university costs, seems likely to be financially viable to me. However, in this model only one-fifth of students could be allowed to progress each year. That means the model would also need some form of orderly exit for students who are filtered out - perhaps an exit qualification, or a pathway into a non-mentored track. The problem is that both options may provide negative signals about the student who is filtered out.

If all students were to progress, then that would require each student to mentor at most one student at the level below. Keeping five senior students mentored by each faculty member, the pyramid would then contain 15 students per faculty member - five senior students, five second-year students, and five first-year students. That system would be much less likely to cover the cost of faculty time. So, it's unlikely that the pyramid mentoring model would be viable to run without some form of filtering - perhaps not as extreme as only one-fifth of students progressing each year, but clearly not all students could progress every year.
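The two numerical examples can be generalised with a short sketch in Python. The function name and parameters are hypothetical, and the intermediate cases (each student mentoring two, three, or four students) are my own illustrations rather than cases worked through above. The sketch assumes a steady state, where the share of students who can progress is simply one divided by the number of mentees per student mentor.

```python
def pyramid_levels(faculty_mentees, mentees_per_student, levels=3):
    """Students per faculty member at each level, senior students first.

    Hypothetical helper: assumes every student mentor takes the same
    number of mentees, and a three-level (three-year) degree by default.
    """
    sizes = [faculty_mentees]
    for _ in range(levels - 1):
        # Each student at one level mentors a fixed number at the level below
        sizes.append(sizes[-1] * mentees_per_student)
    return sizes


for mentees in range(1, 6):
    sizes = pyramid_levels(faculty_mentees=5, mentees_per_student=mentees)
    # In a steady state, the share who can progress is upper level / lower level
    progression = sizes[0] / sizes[1]
    print(f"{mentees} mentees each: levels {sizes}, "
          f"total {sum(sizes)}, progression rate {progression:.0%}")
```

With five mentees each, this gives the 155 students and one-fifth progression from the first example, and with one mentee each it gives the 15 students and full progression from the second. The cases in between show the trade-off: the more students per faculty member, the stronger the filter has to be.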

So, to return to my conclusion from the previous post, the current mass higher education model still looks increasingly fragile, but perhaps one or a few universities might be able to navigate their way through. However, the survivors are likely to be first-movers or fast followers in developing a platform market strategy that leverages a pyramid mentoring model. This model is still going to cost students a lot, and the filtering effect would make higher education more elitist as well.

And thanks to Yunze for inspiring this post with his perceptive email comments.

*****

[*] For simplicity, I'm assuming a three-year higher education degree structure, as we have in New Zealand. For a four-year degree structure, you would of course need to add an additional level.

Sunday, 24 May 2026

Book review: The Corporation

I just finished reading Joel Bakan's 2004 book The Corporation. The book (and the accompanying documentary film, which is available on YouTube) outlines the pathology of the corporation, given the centrality of corporations to modern life. On that point, Bakan writes that:

Today, corporations govern our lives. They determine what we eat, what we watch, what we wear, where we work, and what we do. We are inescapably surrounded by their culture, iconography, and ideology. And, like the church and the monarchy in other times, they posture as infallible and omnipotent, glorifying themselves in imposing buildings and elaborate displays. Increasingly, corporations dictate the decisions of their supposed overseers in government and control domains of society once firmly embedded within the public sphere.

This is Bakan at his most sweeping, and not always at his most convincing. It's clear that we have far more agency in our dealings with corporations than Bakan intimates. But nevertheless, corporations are and have been a big influence on consumer and government decision-making over many decades. That is why, despite being over twenty years old, the book retains its currency. The underlying incentive problems for large corporations have not really changed in that time, even if the particular corporations that might be the target of criticism may have.

Bakan's book develops a portrait of the corporation and its influence that is more than a caricature based on worn-out tropes. His work is based on exhaustive interviews (many of which can be seen in the documentary film) and research. And Bakan avoids the temptation, which many authors of similar books seem not to, to pin the blame solely on neoliberalism, laissez-faire economics, or economics in general. Instead of relying on such lazy cliches, Bakan provides a reasoned argument for questioning the place of the corporation in the modern economy.

One part of the book I appreciated in particular was the section talking about corporate social responsibility (CSR). I hadn't appreciated that CSR initiatives dated back as far as the end of World War I, when they were referred to as 'New Capitalism'. Bakan really takes issue with the double standards that corporations display, which is ably demonstrated by this passage:

Take the large and well-known energy company that once was a paragon of social responsibility and corporate philanthropy. Each year the company produced a Corporate Responsibility Annual Report; the most recent one, unfortunately its last, vowed to cut greenhouse-gas emissions and support multilateral agreements to help stop climate change. The company pledged further to put human rights, the environment, health and safety issues, biodiversity, indigenous rights, and transparency at the core of its business operations, and it created a well-staffed corporate social responsibility task force to monitor and implement its social responsibility programs... The company, which was consistently ranked as one of the best places to work in America, strongly promoted diversity in the workplace. "We believe," said the report, "that corporate leadership should set the example for community service."

That corporation was... Enron. Bakan delivers the punchline flawlessly.

This book is also more than a simple polemic against the evils of corporations. Bakan also considers potential solutions that could bring corporate behaviour more in line with social goals, and with more sincerity and depth than Enron clearly displayed. Bakan considers the potential for regulation, but is skeptical about it due to the risks of regulatory capture. In that, he rightly refers to the work of the economist George Stigler. Bakan finally alights on the prospects of charter revocation laws - laws that would allow the government to terminate a corporation. This is, obviously, a far greater penalty than has ever been imposed on a large corporation. Nevertheless, the rules do exist in many countries. 

Bakan's focus on charter revocation is interesting, but seems like a disproportionate response. If we consider the corporation as a person, which is a position that Bakan explicitly critiques, then charter revocation is the equivalent of a death sentence. While some may think that the worst actions of corporations deserve a penalty at that extreme, the innocent shareholders of the corporation would no doubt disagree. Aside from charter revocation, Bakan also notes that there are several things that governments can do, including improving the regulatory system, strengthening political democracy, creating a robust public sphere, and challenging international neoliberalism. Alongside that, greater penalties for senior managers and directors of corporations that break the rules would likely be an improvement, since that would focus more directly on changing the incentives of the decision-makers whose actions lead to corporate wrongdoing.

I enjoyed reading this book, especially as I didn't feel the need to be constantly defending economics in my mind while reading it. Sadly, Bakan's book could have been written today as many of the problems that he outlines are just as apparent with corporations in the 2020s as they were in the 2000s. There has been plenty of CSR language since then, as well as the rise of environmental, social, and governance (ESG) initiatives, but it is hard to see much of this as more than the same low-level box-ticking exercise that Bakan critiques. There has been far less movement on the deeper institutional reforms that Bakan favours, so it is likely that the problems will be just as apparent for corporations in the 2040s. It doesn't have to be that way, and anyone who believes in change would do well to read this book as a starting point.

Saturday, 23 May 2026

Does the future of higher education look more like one-on-one mentoring?

It almost seems a cliche to say that generative AI is both the greatest threat, and the greatest opportunity, for higher education. That doesn't make the statement any less true. And there are many commentators who are trying to work out what happens next for higher education. I think one of the best is Hollis Robbins, whose Substack Anecdotal Value is well worth subscribing to.

Robbins was recently interviewed by Jay Caspian Kang of The New Yorker (paywalled, but you can find an ungated version here) about the future of higher education. The whole interview is worth reading, but I want to highlight this bit in particular, where Robbins says:

I was in Austin, Texas, a couple of times in March with a bunch of twenty-five-year-old billionaires. This is what they’re looking at. Instead of having the credential from the institution, why not have the credential from the professor? If you have a Hollis Robbins education, what would that signal? What would that credential mean as opposed to a degree from a university? There was some conversation about what that would look like, and one guy at the end of the dinner said, “Instead of OnlyFans, it’s like OnlyProfessors.”

The correct analogy here would be that it would be OnlyStudents, not OnlyProfessors (since the name contains the audience, not the performer). However, Robbins makes a good point. Higher education is, in part, an exercise in signalling (as I've noted before here and here). In fact, Bryan Caplan argued in his book The Case Against Education (which I reviewed here) that one-third of the benefit of higher education is signalling (the other two-thirds is made up of genuine learning, socialisation, and transferable skills).

The diploma that a student receives at the end of their higher education journey is a signal to employers of the student's quality as a future employee. The signal is credible to the employer because it is costly for the student to obtain, and costly in such a way that low-quality students wouldn't attempt the signal. University reputation matters here, because it is an assurance of the second of those conditions. First, low-quality students wouldn't attempt the signal of a Harvard degree because they wouldn't gain admission to Harvard in the first place. So, part of the signal comes from getting into Harvard. Second, low-quality students wouldn't attempt the signal of a Harvard degree because passing courses at Harvard is hard (or, at least, harder than at many other universities). The quality of the education at Harvard has traditionally been higher than elsewhere.

In the interview, Robbins makes a further important point, which is that we've spent the last several decades making higher education a commodity. Students studying a degree in a particular subject learn the same things, often using the same teaching materials, the same textbook, and the same style of assessment, regardless of which university they go to. That means that the quality of signal that arises from the education itself has reduced over time, meaning that most of the value of the signal arises from the admission process. Once a student has been admitted to Harvard, their signal is in place, and the further signalling from their education is lower than for comparable students in years past.

Robbins argues that generative AI is accelerating and expanding this commodification of education. Since generative AI has access, through its training corpus, to a large store of human knowledge, to add value over and above generative AI a professor must be a true expert in a very specific subfield. For students, learning a subject from a true expert still retains value, because generative AI cannot as easily replicate the learning that would occur from the expert. At that point, the university becomes less important as a mediator of education.

In other words, Robbins is arguing that the university could be disintermediated, with individual professors issuing credentials instead of universities. Who needs a Harvard degree, when you could have a Hollis Robbins degree? And since Robbins is an expert (in the African American sonnet tradition, she says), the signal retains high value. However, that is true only to the extent that employers find the Hollis Robbins degree a credible signal.

That brings me to this Substack post by Auren Hoffman. Hoffman also argues that generative AI has diminished the value of the higher education signal. However, Hoffman argues for a different solution, this time from the perspective of the graduating student:

you have to show you can add value. that is it. that is the only thing.

the test the smart hiring manager applies in 2026 is simple. can you learn something on your own? can you finish what you started? can you do what you said you would do? these were always the skills that mattered. the difference is they are now the ONLY skills that matter, because the credential stopped doing the screening.

if you have a few years of experience, your resume can show you have these skills. if you are a new grad, you have to show what you have created and built.

Hoffman is overstating things a little, as university degrees are unlikely to disappear overnight. However, his critique is still important, as is the implication he draws. Hoffman argues that graduates (or young people, generally, since the solution doesn't depend on a student attending a university or completing a degree) need to become builders:

the most valuable thing a 22 year old can do in 2026 is create something. an app. a screenplay. a side business. an internal tool you wrote for a club you were in. a dinner series. a script that automates something annoying. a website. a chrome extension. a sculpture. a dance party. a discord bot. a substack with 32 readers and a real point of view. anything that moved from idea to working.

will the thing make money? probably not. that is not the point.

the point is that you taught yourself something. you finished it. you can describe what you learned, what broke, what you fixed, why you made the calls you made. that story is the new resume.

every hiring manager would rather interview a 22 year old with a launched app and a github full of weird side projects than a 22 year old with a 3.9 GPA from a top 50 school. it is not close. when one candidate has tangible evidence of what they can ship and the other has a transcript, the transcript will lose every time.

According to Hoffman, the best signal for young people to be sending in future is that they can build. Our students will ultimately be more successful if they can show off their skills (both technical and transferable skills) by building something. That is a signal that is costly, and costly in such a way that low-quality students will not attempt it. The signal is credible, and for the most part it retains currency even in the face of generative AI. Indeed, if the student builds something while effectively leveraging generative AI, then the signalling value to employers may be even greater. The key distinction is between the student using generative AI as a tool and the student using it as a substitute for doing the work. A student who can explain what they built, what broke, what they learned, and why they made the choices they made, is still sending a costly signal. A student who simply lets generative AI build for them is not. Employers would likely see through the latter pretty quickly.

Hoffman's idea isn't exactly new. When I think about some of my best students over the past two decades, they tend to be those that built something either while they were studying, or immediately after. For some, this was their own small business or entrepreneurial activity. For others, it was writing and publishing a research paper. Those were challenging tasks that set them apart from other students - a clear signal of quality. What Hoffman is essentially saying is that, with the signal from higher education itself being removed, the only remaining signal of quality is the signal from being a builder.

Where does that leave higher education staff? I think we can combine Robbins's and Hoffman's ideas, and chart a path forward. We don't need to start an OnlyStudents, and issue our own degrees, but we do need to cultivate closer relationships with our best students. Teach them to be builders. Encourage them to create things. Work with them and chart a path forward for their success. In other words, be a mentor.

Universities are absolutely going to hate this. Mentoring is not an activity that can be offered at scale. For example, there are simply not enough hours in the day for me to individually mentor all of the 350-plus students in my first-year economics class. Nor is mentoring easy to timetable, measure, standardise, or reward under current academic workload models. Universities are built around papers, credit points, learning outcomes, assessment rubrics, and student evaluations, all of which can be offered at scale. One-on-one mentoring doesn't work so well in that system. For example, postgraduate supervision already sits awkwardly within the system, with it being unclear whether it counts as teaching, or research. Nevertheless, mentoring may be one of the few ways that higher education can continue to offer something that is both valuable and difficult for generative AI to replicate.

If generative AI significantly reduces the signalling value of university education, and if students increasingly use it to avoid genuine learning, then the current mass higher education model looks increasingly fragile. Moreover, what remains is going to cost students a lot more. If having a high-quality mentor who can encourage a student to build is the path to their future employment, then it may be worth it. After all, high-quality signals are costly. That aspect of education, at least, won't have changed.

[HT: Marginal Revolution, for both the Hollis Robbins interview and the Auren Hoffman post]

Read more: