Tuesday, 16 June 2026

My take on that iPhone-fertility paper

If you've been reading the news over the last week, you may have seen talk about new research linking fertility decline in the US to the release of the iPhone. For example, the New Zealand Herald reported that:

Middlebury College economist Caitlin Myers and her student Ezekiel Hooper tested a hypothesis that smartphones - which emerged with the arrival of the first iPhone in 2007 - might have something to do with it.

Until 2011, iPhones were available from a single US cellular network, AT&T, so they compared US counties that had near-universal AT&T coverage with those that had little or none during those years.

And they found that access to the iPhone correlated with reductions in births by 4.5% to 8% at ages between 15 and 19, and by 3.2% to 6.6% at ages between 20 and 24.

There were also statistically significant but smaller declines among older women.

Other news sources picked up that the research attributed 33 to 52 percent of the decline in fertility to the iPhone's release (see here and here, for example). That result made me sceptical, and my concerns really echo those of Tyler Cowen here:

In 2008, 1.9% is the share of the mobile-subscribing population with an iPhone wireless subscription.  As a percent of all adults that is 1.6%.

In 2009, it is 4.3%.  3.6% of all adults.

In 2010, 6.8%.  5.5% of all adults...

So when the authors talk about diffusion explaining 33–52% of the decline in the general fertility rate among American women 15–44, I still do not get how that is supposed to operate.

If less than six percent of all adults have an iPhone by 2010, how could iPhones reduce fertility by between one-third and half? This requires very large spillovers from a small group of early adopters, and I am not convinced the paper has made those spillovers quantitatively plausible (we'll get to the authors' views on that later).
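To see why the numbers look so demanding, here is a back-of-the-envelope sketch in Python. It takes the estimated declines in births at ages 15 to 19 quoted above (4.5 to 8 percent) and Cowen's figure of 5.5 percent of all adults having an iPhone in 2010, and asks how far births among owners would have to fall if non-owners were entirely unaffected. The function name is my own, and applying the national adult ownership share to young women in the treated counties is an assumption made purely for illustration (ownership there was probably higher).

```python
def required_owner_decline(aggregate_decline, ownership_share):
    """Proportional fall in births needed among owners alone to produce
    the aggregate decline, assuming non-owners are completely unaffected."""
    if not 0 < ownership_share <= 1:
        raise ValueError("ownership_share must be in (0, 1]")
    return aggregate_decline / ownership_share


# Estimated declines in births at ages 15-19 (as quoted above)
low_estimate, high_estimate = 0.045, 0.080

# Share of all adults with an iPhone in 2010 (Cowen's figure).
# ASSUMPTION: the same share applies to young women in treated counties.
ownership_2010 = 0.055

for decline in (low_estimate, high_estimate):
    needed = required_owner_decline(decline, ownership_2010)
    print(f"aggregate decline {decline:.1%} -> owners' births must fall {needed:.0%}")
```

Any answer above 100 percent is impossible without effects on non-owners, which is why the size of the spillovers matters so much here.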

The research is reported in this NBER Working Paper by Caitlin Myers and Ezekiel Hooper (both Middlebury College). They use data on national wireless broadband coverage at the census block level to categorise US counties into those where less than 10 percent of the population have coverage by AT&T ('control' counties) and those where more than 90 percent of the population have coverage by AT&T ('treated' counties). Their sample includes 1399 'control' counties and 914 'treated' counties (with 794 counties excluded from the sample). Myers and Hooper chose AT&T because it had an exclusive arrangement with Apple for almost the first four years after the iPhone launched in June 2007. The first Android phones didn't become available until October 2008, and didn't become widespread in the 'control' counties until a year later. So, there was a period where AT&T coverage is a reasonable proxy for the prevalence of iPhones.
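The classification rule is simple enough to sketch in a few lines of Python. The thresholds and county counts come from the description above, but the function name and the treatment of coverage as a share between 0 and 1 are my own assumptions, not the authors' code.

```python
def classify_county(att_coverage_share):
    """Assign a county to a group based on the share of its population
    covered by AT&T (a number between 0 and 1)."""
    if not 0 <= att_coverage_share <= 1:
        raise ValueError("coverage share must be between 0 and 1")
    if att_coverage_share < 0.10:
        return "control"
    if att_coverage_share > 0.90:
        return "treated"
    return "excluded"  # the middle group is dropped from the sample


# County counts as reported above
county_counts = {"control": 1399, "treated": 914, "excluded": 794}
total_counties = sum(county_counts.values())
share_used = (county_counts["control"] + county_counts["treated"]) / total_counties

print(total_counties)
print(f"{share_used:.1%} of counties are in the analysis sample")
```

Dropping the middle group sharpens the contrast between treated and control counties, but it also means the two groups start out looking quite different from each other.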

Myers and Hooper then compare control counties with treated counties in terms of annual age-specific fertility rates (in five-year age groups). However, they recognise a key problem, which is that the treated and control counties differ in meaningful ways, the most obvious of which is that the treated counties are more urban than the control counties. This is a problem for their analysis because fertility rates have been declining more rapidly in urban areas than in rural areas, and therefore this would lead to overstatement of the measured effect of iPhone coverage on fertility. Specifically, the CDC reports that from 2007 to 2017, the total fertility rate fell by 12 percent in rural counties (many of which will be in the control sample), but by 18 percent in large metro counties (which are almost certainly in the treated sample).
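As a rough illustration of how much a pre-existing urban-rural gap in trends could matter, the Python sketch below converts the CDC's ten-year declines into constant annual rates and works out the gap that would open up over a four-year window with no iPhone effect at all. It assumes (strongly) that the declines were spread evenly over 2007 to 2017, that all treated counties look like large metro counties, and that all control counties look like rural counties. None of that is in the paper, and the function names are mine.

```python
import math


def annual_log_change(total_decline, years):
    """Constant annual log change implied by a total proportional decline."""
    return math.log(1 - total_decline) / years


def spurious_gap(treated_decline, control_decline, span_years, window_years):
    """Log-point gap between treated and control fertility that opens up over
    window_years purely because of different underlying trends."""
    treated = annual_log_change(treated_decline, span_years)
    control = annual_log_change(control_decline, span_years)
    return (treated - control) * window_years


# CDC figures quoted above: 18% (large metro) and 12% (rural) over 2007-2017.
# ASSUMPTION: evenly spread declines, and a four-year window (2007-2011).
gap = spurious_gap(0.18, 0.12, span_years=10, window_years=4)
print(f"gap with zero iPhone effect: {gap:.3f} log points")
```

Under those assumptions the gap is close to three percent before any re-weighting, which is not trivial next to the estimated effects.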

Myers and Hooper try to deal with this problem by re-weighting their data in two ways. The first is by using an "entropy balanced Poisson event study", which effectively re-weights the control counties by giving more weight to those that are most similar to the treated counties in terms of their cross-sectional characteristics at the time of the iPhone launch. The second is by using a "synthetic difference-in-differences estimator", which creates a set of synthetic control counties by re-weighting the control counties so that the time series of fertility most closely matches each of the treated counties.
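To give a feel for what entropy balancing does, here is a stripped-down Python sketch that balances a single covariate. It finds weights of the form exp(lambda * x) on the control counties so that their weighted mean matches a target (the treated mean), which is the solution to the entropy balancing problem when there is one moment condition and equal starting weights. The data are hypothetical, and the paper balances many characteristics inside a Poisson event study, so treat this as an illustration of the idea rather than the authors' estimator.

```python
import math


def tilted_weights(x, lam):
    """Weights proportional to exp(lam * x), normalised to sum to one."""
    exponents = [lam * xi for xi in x]
    shift = max(exponents)  # subtract the maximum to avoid overflow in exp
    raw = [math.exp(e - shift) for e in exponents]
    total = sum(raw)
    return [r / total for r in raw]


def weighted_mean(x, w):
    return sum(xi * wi for xi, wi in zip(x, w))


def entropy_balance(x_control, target_mean, lo=-200.0, hi=200.0, iterations=200):
    """Bisect on lam until the weighted control mean matches target_mean.
    The weighted mean is increasing in lam, so bisection is enough."""
    if not min(x_control) < target_mean < max(x_control):
        raise ValueError("target mean must lie strictly inside the control range")
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if weighted_mean(x_control, tilted_weights(x_control, mid)) < target_mean:
            lo = mid
        else:
            hi = mid
    return tilted_weights(x_control, (lo + hi) / 2)


# HYPOTHETICAL urban population shares for six control counties
urban_share_controls = [0.10, 0.20, 0.20, 0.30, 0.40, 0.90]
treated_mean_urban_share = 0.80  # hypothetical treated-county mean

weights = entropy_balance(urban_share_controls, treated_mean_urban_share)
for share, weight in zip(urban_share_controls, weights):
    print(f"urban share {share:.2f} -> weight {weight:.3f}")
```

Notice that when the treated mean sits near the edge of the control range, almost all of the weight has to go on the one control county that looks urban.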

Using those methods, Myers and Hooper find the results that the news media has picked up. Specifically:

Both estimators imply large, statistically significant declines in births to young women. The post-gestation ATT ranges from −4.5 to −8.0% at ages 15–19 and −3.2 to −6.6% at ages 20–24 (the entropy-balanced Poisson at the lower-magnitude end, SDID at the higher), with smaller effects at older ages. Scaled to the U.S. county universe, these estimates imply the iPhone accounts for between 33 and 52% of the 2007–2011 decline in the general fertility rate. The pattern is similar across race, parity, marital status, and education, with the exception of Black women, for whom we estimate no effect.

The key results are summarised in Figure 3 from the paper (for the entropy balanced Poisson event study):

And in Figure 4 from the paper (for the synthetic difference-in-differences (SDID) estimator):

In both cases, the point estimates from the time before 2008 show no statistically significant difference between treated and control counties, while there is a negative (and increasing) difference between treated and control counties from 2008 onwards. However, notice that in Figure 3 (the first figure above), it seems clear visually that the downward trend starts before 2008, even if it is statistically insignificant. In Figure 4, there is no pre-trend, but remember that in the SDID analysis, the controls are reweighted to replicate the pre-treatment time series of fertility for the treated counties, so there should be no difference in the pre-treatment values by construction.

Myers and Hooper run various robustness checks that address some of the more obvious criticisms of their approach, including sensitivity to the choice of treatment and control cutoffs, using a continuous treatment variable, estimating the model in levels rather than logs, various placebo treatments, and truncating the sample to exclude any contamination from the release of Android phones. Among the placebo tests, they run analyses using Verizon's and Sprint's pre-2011 coverage, and find no effects. So, their findings do not simply reflect general differences between counties that attract mobile operators and those that don't. They also address the plausibility of the results, noting that:

The iPhone is not a treatment that operates at the individual level. Whether one’s own phone matters likely depends on whether one’s peers have phones; a phone in a friend group full of non-owners is a different intervention than a phone in a group where everyone has one. Spillovers run between phone-owning peers and their non-owning friends, and operate at the level of the group, not just the match: if smartphones reduce friend-group meetups and parties, then matches that would have formed under no-iPhone simply never do—the unformed match is itself the outcome.

That may be so, but the implied size of the spillovers is far larger than is plausible. If, as Cowen notes, less than six percent of all adults had an iPhone by 2010, then unless iPhone ownership and the spillovers from it were heavily concentrated among women of childbearing age, the overall effect simply can't be that large.

So, what has gone wrong? The overall approach that Myers and Hooper apply seems valid on the face of it, and re-weighting controls to better match the treated sample is a common method of causal inference. The problem here is that the weighting is extreme. Myers and Hooper note, in relation to the entropy balanced Poisson event study approach:

Balance comes at a cost: equalizing the marginal means requires putting high weight on a small number of treated-like controls. The Kish (1965) effective sample size of the balanced control pool is 77 out of 1,399 raw controls...

So, basically the analysis is heavily skewed towards a comparison between the treated counties and a small number of control counties, which are the control counties that are most like the treated counties (which also makes them the most unlike the other control counties). Those control counties are doing a lot of the work in this analysis.
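The Kish effective sample size is easy to compute: it is the square of the sum of the weights divided by the sum of the squared weights. The Python sketch below uses made-up weights (not the paper's) to show the two extremes: equal weights on all 1,399 controls, and all of the weight spread equally over just 77 of them.

```python
def kish_effective_sample_size(weights):
    """Kish effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    total_squares = sum(w * w for w in weights)
    if total_squares == 0:
        raise ValueError("at least one weight must be non-zero")
    return total * total / total_squares


# Equal weights: every control county counts fully
equal_weights = [1.0] * 1399

# ILLUSTRATIVE extreme: 77 controls share all the weight, 1,322 get none.
# This is not the paper's actual weight distribution.
concentrated_weights = [1.0] * 77 + [0.0] * 1322

print(kish_effective_sample_size(equal_weights))
print(kish_effective_sample_size(concentrated_weights))
```

So an effective sample size of 77 carries about as much information as comparing the treated counties with 77 equally weighted controls, and the remaining controls contribute next to nothing.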

There are also other possible differences between urban and rural counties that are approximately contemporaneous with the release of the iPhone. First among these is the 'Great Recession' and the housing slump around that time. Myers and Hooper do control for county-level changes in house prices, so that reduces concerns about contamination from that source. They also control for unemployment and poverty rates, which might pick up differential changes in labour markets. However, there was also a change in contraceptive availability that directly affects young women's fertility: expanded access to the 'morning after pill' for 17-year-olds, although that occurred in 2009. Finally, after the 'Great Recession' there was a slowdown in Hispanic immigration, which might have affected urban and rural counties differently. Hispanic immigrants tend to have relatively higher fertility than the US-born, so if the decline in Hispanic immigration was smaller in control counties (and especially in the small number of heavily weighted control counties) than in treated counties, then that might explain the effect. Myers and Hooper control for county Hispanic population share. However, it would be better to control for Hispanic population share among the age group that is being analysed, or to control for changes in Hispanic immigration.

This paper has certainly gotten people talking. Smartphones might be part of the story of why fertility has declined, but I don't think that we should uncritically take away from this study that the iPhone caused up to half of the decrease in US fertility between 2007 and 2011. More likely, it had a modest effect (if any), and the estimate is confounded by a number of other changes that differentially impacted rural and urban US counties at around the same time.

[HT: Marginal Revolution]


Sunday, 14 June 2026

Book review: How to Think Like an Economist (Roger Arnold)

If you ask many economics teachers, they will tell you that they really want to teach students how to think like an economist. However, in amongst the supply and demand curves, the elasticities, and the multiplier effects, the core goal of teaching students to actually think like an economist gets lost, overwhelmed by a lot of "do this stuff like an economist". So, it's interesting when a book actually tries to get behind the models and teach the underlying thinking.

That's what the 2005 book How to Think Like an Economist, by Roger Arnold, tries to do. Arnold explains that:

To teach students how economists think, we must tell them stories. While we tell the stories, we must point out just what is "running through the economist's head." In this book, I have tried to focus on what goes through the economist's head as he or she looks at the world.

And mostly, Arnold is successful, although it isn't always the case that every economist would think in the same way. For example, Arnold makes a big deal about ratios. And while ratios are important, I for one am never thinking about the ratio of marginal benefit to marginal cost, when I can simply think about which one is larger. The ratio is redundant.

There is a lot to like about this book, and Arnold surfaces some of the more surprising (to non-economists) ways that economists would think about problems. For example, who but an economist would even ask the question, "What is the optimum amount of hitting yourself in the head with a hammer?". And yet, Arnold treats us to a consideration of exactly that question in the second chapter.

Having said that, I felt like the book was quite uneven. Although Arnold warns readers at the beginning that the book is intended as a companion to a more thorough textbook economics treatment, and gives examples of how the chapters can be mixed and matched with various styles of economics courses, a reader reading the book chapter by chapter is constantly confronted with terminology that is left unexplained until later chapters. This was most jarring in the case of the 'equilibrium price', which came with no explanation of what equilibrium is, nor why the equilibrium price is important at all. Similarly, Arnold uses the term ceteris paribus at first without explaining what it means. And if you want to understand how the economist thinks, understanding the meaning of ceteris paribus (which, for the record, means holding all else constant) is kind of important.

Arnold also betrays a lack of understanding of some real-world context. Blackjack is provided as an example of a zero-sum game played between the players. However, blackjack in the real world is not at all like that. Blackjack players are playing against the house, not against each other. One blackjack player's win does not in itself entail a loss to the other players.

So, although understanding how economists think is important, and I applaud the effort and the approach that this book takes, I feel like it fell a bit short of the mark. This book is long out of print, but that might not be such a bad thing.

Friday, 12 June 2026

This week in research #130

Here's what caught my eye in research over the past week:

  • Fumarco and Groero (open access) describe a Stata package that reduces a dataset down to just those variables that are used in a particular .do file (useful for creating replication packages while minimising data bloat)
  • Cox (open access) describes three Stata commands that create a new dataset of the quantiles, percentiles, or confidence intervals for a particular variable or result (if you've ever needed to do this, you will know how frustrating it is)
  • Yarashov, Baryshnikova, and Kakhkharov find that military expansion exerts a significant negative impact on fertility across 15 post-Soviet countries between 1992 and 2022
  • Chatterjee, Dimova, and Ojha (open access) find, using a correspondence study in urban India, that equally qualified single mothers are much less likely to receive interview callbacks than unmarried women without children, married women, and married mothers
  • Charness et al. (with ungated earlier version here) provide a convincing argument for the virtues of lab experiments in economics
  • In a companion piece, Gneezy examines the principles of experimental economics
  • Wang finds that China's policy to limit young people's access to online video games did not produce detectable effects on academic performance, study time, or health
  • Pritchett and Viarengo (open access) demonstrate that ad hoc poverty lines, including the World Bank's poverty lines, are far too low to be plausible candidates for an inclusive global poverty line

Wednesday, 10 June 2026

Is it working from home, and not generative AI, that is harming the prospects of young workers?

There is growing evidence that the labour market for young workers is challenging. Graduates are finding it more difficult to get jobs after graduation. Several research papers have noted that generative AI may be to blame (see this post, for example), with one research paper referring to the changes in the labour market as seniority-biased technological change (see this post).

But the challenge with trying to attribute changes in the labour market to the rise of generative AI is that there are other contemporaneous changes affecting the labour market as well. One of those changes is the rise of working from home (as I noted in yesterday's post). Working from home may reduce the prospects for junior workers in part because it costs more to supervise and monitor them when they are working from home. Junior workers also benefit from on-the-job learning when they work with other people, and that on-the-job learning is less effective when they work from home. Combining those two effects, working from home reduces the incentive for employers to hire junior workers.

This new working paper by Peter Lambert (University of Warwick) and Yannick Schindler (Ellison Institute of Technology, Oxford) tries to disentangle the effects of generative AI and working from home on employment of younger workers. They use data from Revelio Labs that is made up of monthly matched employer-employee records collected from résumés (predominantly from LinkedIn) to construct a measure of the junior share of all new hires. They also use data from Lightcast on the near-universe of online job postings across thousands of online job sites and other websites. They use the Lightcast data to construct a measure of the share of job postings that require three or fewer years of experience. Their data from both sources covers the period from 2017 to 2025, and includes four countries: the US, the UK, Canada, and Australia.

Lambert and Schindler then use that data, along with measures of 'exposure to generative AI' and 'exposure to working from home' at the occupation level, in a difference-in-differences strategy. That means that they essentially compare the change in the share of junior job hires (or job postings) between occupations that are more or less exposed to generative AI (or working from home). Their main results are neatly summarised in Figure 3 from the paper:

Panel (a) shows that the junior share of new hires decreases significantly in jobs that are more exposed to working from home, from 2023 onwards (the black line). When they also control for exposure to generative AI (the red line), the effect of working from home barely changes. Panel (b) shows that the junior share of new hires also decreases significantly in jobs that are more exposed to generative AI, from 2023 onwards (the black line). However, when they also control for exposure to working from home (the blue line), the effect of generative AI becomes much smaller and statistically insignificant. The results are similar for the share of job postings requiring three or fewer years' experience, as shown in Panels (c) and (d) of the figure.

The effects are quite large too. A one-standard-deviation increase in exposure to working from home reduces the junior share of new hires by about two percentage points, and the share of job postings requiring three or fewer years' experience by 1.5 percentage points.
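The logic of the difference-in-differences comparison can be sketched in a few lines of Python. This is the simplest two-group, two-period version with made-up numbers. Lambert and Schindler use continuous exposure measures and track effects over time, so this shows the underlying idea only, and the function name and values are hypothetical.

```python
def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the more-exposed group minus change in the less-exposed group."""
    return (treated_post - treated_pre) - (control_post - control_pre)


# HYPOTHETICAL junior shares of new hires, before and after
high_exposure_pre, high_exposure_post = 0.30, 0.26  # occupations more exposed to WFH
low_exposure_pre, low_exposure_post = 0.30, 0.29    # occupations less exposed to WFH

estimate = diff_in_diff(
    high_exposure_pre, high_exposure_post, low_exposure_pre, low_exposure_post
)
print(f"difference-in-differences estimate: {estimate:+.3f}")
```

The change in the less-exposed occupations stands in for what would have happened to the more-exposed occupations anyway, so anything else that hits the more-exposed occupations at the same time ends up in the estimate.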

Lambert and Schindler conclude that, based on their results, working from home is a better predictor of the decline in junior hiring than generative AI. Given potential benefits of working from home, they are reluctant to recommend policies against working from home, instead noting that:

...micro-level adjustments may be required to help firms adapt their organizational practices, so as to enjoy the benefits of WFH [work from home] arrangements while simultaneously managing the development of early-career talent.

Seen alongside the negative mental health impacts of working from home (as noted in yesterday's post), this should give us further pause for thought. However, it is worth noting that even if working from home is a better predictor of reductions in junior hiring than generative AI within their model, that doesn't let generative AI off the hook entirely. Since both trends are happening at the same time, reducing working from home might not eliminate the negative impacts on junior hiring, but instead make generative AI appear more important as an explanation. Lambert and Schindler note early in their paper that it is often the same occupations (white-collar occupations) that are most exposed to both working from home and generative AI. Given that, perhaps Lambert and Schindler's recommendation for micro-level changes in organisational practice may be the best mitigation strategy available to us.

[HT: Marginal Revolution]
