Tuesday, 24 March 2026

Evidence that artificial intelligence is increasing the impact, but narrowing the scope, of research

There is growing evidence of positive impacts of generative artificial intelligence on productivity. This includes productivity in research (see this post, for example), including my own. However, some have questioned whether increasing research productivity comes at a cost of narrowing the scope of research.

So, I was interested to read this article by Qianyue Hao (Tsinghua University) and co-authors, published in the prestigious journal Nature (ungated earlier version here) late last year. They look at the impact of AI tools (not limited to generative AI) on the productivity of researchers and the quality of research. Specifically, they look at authors publishing in six representative fields: biology, medicine, chemistry, physics, materials science, and geology, across three 'eras': (1) the 'machine learning era' (from 1980 to 2014), (2) the 'deep learning era' (from 2015 to 2022), and (3) the 'generative AI era' (from 2023 onwards). Hao et al. compare authors who publish 'AI augmented papers' with those who do not. An 'AI augmented paper' is one that uses methods such as:

...support vector machines and principal component analysis from the machine learning era, and convolutional neural networks and generative adversarial networks from the deep learning era. Large language models, which have emerged in recent years, also rank among the most frequently used methods...

Using a dataset that includes over 27 million papers with complete records that were published between 1980 and 2025, of which about 310,000 were 'AI augmented', Hao et al. find that:

...annual citations to AI papers are 98.70% higher than those to non-AI papers on average...

So, AI augmented research gathers more citations, which suggests that authors using AI in their research achieve greater impact. This is reinforced by evidence that AI augmented papers are published in higher quality journals (with Q1 journals being the highest ranked). Hao et al. report that:

...the proportion of AI papers in Q1 journals is 18.60% higher than that of non-AI papers in all journals; in Q2 journals, the AI proportion is 1.59% higher; whereas Q3 and Q4 journals hold a relatively lower proportion of papers with AI... These results indicate a heterogeneous distribution of AI-augmented papers across journals, with a higher prevalence in high-impact journals.

And AI appears to make authors more productive, as:

On average, researchers adopting AI annually publish 3.02 times more papers... and garner 4.84 times more citations... than those not adopting AI, with consistency.

All of these results seem to hold across all of the disciplines that Hao et al. consider. However, it is not all good news. Hao et al. use machine learning to create a measure of the 'breadth of scholarly attention'. Using that measure, they find that:

Compared with conventional research, AI research is associated with a 4.63% contracted median collective knowledge extent across science, which is consistent across all six disciplines... Moreover, when dividing these disciplines into more than two hundred sub-fields, the contraction of knowledge extent can be observed in more than 70% of them...

Of course, some of the differences here may be due to selection, as the types of researchers, and the types of research, that involve AI may be meaningfully different from those that do not. However, putting the selection issues aside, Hao et al. note that there is a tension between the individual researcher's incentive to produce a greater quantity of research that has higher impact, which would suggest greater use of AI, and the social incentive to produce a greater breadth of research.

So, the takeaway from this paper is that we need to consider researcher incentives, not just productivity. Specifically, this research suggests that the use of AI in research is leading to a 'prisoners' dilemma' outcome: each individual researcher acting in their own best interests (and using AI in their research) leads to an outcome that is worse for society overall (less breadth of research and more incremental gains).

Hao et al. conclude that:

The substantial academic benefits of AI use may be a driving force behind its accelerated rate of adoption; however, we also find unintended consequences from the increased prevalence of AI-augmented research. In all fields, AI-augmented research focuses on a narrower scope of scientific topics and reduces the scientific engagement of follow-on research, leading to more overlapping research work that slows the expansion of knowledge. Further, with a greater concentration of collective attention to the same AI papers, the adoption of AI seems to induce authors to converge on the same solutions to known problems rather than create new ones.

So, what is the solution here? Society probably wants research to be higher quality and to have a broad scope. But individual researchers' incentives to use AI in their research appear inconsistent with that outcome. When the prisoners' dilemma is played as a repeated game (see here or here, for example), the players can avoid the worst outcome by cooperating. In this case, the researchers could cooperate by agreeing not to use AI in their research. The problem is that every researcher has an incentive to cheat on that agreement, since using AI would be good for their career. Ensuring cooperation is more difficult in this prisoners' dilemma than in the traditional two-player game, because there are not just two players who need to cooperate, but thousands (or millions). Ensuring cooperation in a prisoners' dilemma with many players, each of whom is far better off cheating than cooperating, is almost impossible (which is why solving the problem of climate change is so difficult).
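The many-player version of this dilemma can be sketched in a few lines of code. All of the payoff numbers below are assumptions for illustration only (they are not estimates from Hao et al.):

```python
# A minimal sketch of the many-player prisoners' dilemma, with assumed
# (illustrative) payoffs: AI adoption gives a private gain, but every
# adopter adds to a breadth-of-research loss that is shared by all.

def payoff(uses_ai: bool, n_adopters: int, n: int) -> float:
    """Payoff to one researcher when n_adopters of n researchers use AI."""
    private_gain = 3.0 if uses_ai else 0.0   # assumed individual boost from AI
    breadth_loss = 4.0 * n_adopters / n      # assumed shared cost of narrower science
    return private_gain - breadth_loss

n = 1000

# Whatever the other 999 researchers do, adopting AI is individually better...
for others_adopting in (0, 500, 999):
    assert payoff(True, others_adopting + 1, n) > payoff(False, others_adopting, n)

# ...yet universal adoption leaves everyone worse off than universal restraint.
print(payoff(True, n, n), payoff(False, 0, n))  # -1.0 0.0
```

The key feature is that each researcher's private gain from adopting AI exceeds their individual share of the collective breadth loss, so adoption is a dominant strategy for every player even though universal adoption is collectively worse.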

My own view is that the answer is not to keep AI out of research. That is not realistic, in the same way that it's not realistic to expect students not to use generative AI. The incentives need to be redesigned, but this will be no easy task. As long as universities, research funders, and publishers reward researchers for quantity, citations, and publication in top-ranked outlets, we should expect more AI-augmented work, with a narrower scope than society might prefer. If we want AI to expand knowledge rather than simply accelerate competition within narrow foci, then we need institutions that also reward novelty, breadth, and the discovery of new questions. That is the economic challenge we must face up to.

[HT: Marginal Revolution]

Monday, 23 March 2026

The relationship between obesity of politicians and corruption is correlation, not causation

Not every correlation between two variables represents a causal relationship. Even if we can tell a compelling story about why a change in one variable might cause a change in another, that doesn't make the relationship causal. Sometimes a correlation actually results from something other than the story you tell. Sometimes the correlation is just random noise (a spurious correlation). So, we should be cautious when interpreting correlations.

I was reminded of this when reading this 2021 article by Pavlo Blavatskyy (University of Montpellier), published in the journal Economics of Transition and Institutional Change (sorry, I don't see an ungated version online). The article even generated a small debate, with a comment by György Márk Kis, and then a reply by Blavatskyy, appearing in the same issue of the journal.

In the original article, Blavatskyy looks at the relationship between the body mass index (BMI) of politicians in a country and the Corruption Perceptions Index by Transparency International. The data Blavatskyy uses is for 2017, and the sample of countries is limited to 15 post-Soviet countries (Armenia, Azerbaijan, Belarus, Estonia, Georgia, Kazakhstan, Kyrgyzstan, Latvia, Lithuania, Moldova, Russia, Tajikistan, Turkmenistan, Ukraine, and Uzbekistan). The argument for why this correlation matters is explained in Blavatskyy's reply to Kis:

One common form of corruption/lobbying is inviting governmental officials to lavish banquets with excessive consumption of food and drinks... Corrupt politicians frequenting such banquets might risk gaining extra weight. This ‘hedonic theory of corruption’ postulates the existence of a positive relationship between median body mass index of public officials and the level of grand political corruption in society.

So, Blavatskyy is able to tell a good story for why greater corruption would cause higher BMI among politicians. However, that doesn't mean that the relationship is causal, even though the correlation between perceived corruption and median politician BMI is clear from Figure 1 in the original paper:

Low numbers in the Corruption Perceptions Index represent higher levels of perceived corruption. So, this figure shows that countries where politicians have a higher median BMI also have higher levels of perceived corruption.

Kis took issue with a number of things in the paper. First, why those 15 countries? Why not all countries? Kis shows that if you separate the 15 countries in Blavatskyy's sample by their geographic location, you get different relationships within each subsample. However, the broader question is not what happens when you look at subsamples, but whether the relationship holds if you add more countries to the sample. Neither Blavatskyy nor Kis answers that question. We should also wonder whether there is something special about 2017 that leads to this correlation. Does it hold in other years?

In his reply, Blavatskyy doesn't really address those two points (the narrow sample, and the single year) in a convincing way. Instead, he narrows the sample even further, to look at changes in politician BMI and perceived corruption for just one of the countries in his sample, Ukraine. In that analysis, he again shows a correlation between corruption perceptions and politician BMI, in this case over time for Ukraine. However, that simply raises the question: why Ukraine? Why didn't he look at the other countries in his sample in that way? And just because Ukraine shows a correlation over time, that still doesn't demonstrate a causal relationship.

Kis also takes issue with the machine learning algorithm that Blavatskyy uses to estimate the BMI for politicians in his sample. Kis notes that the accuracy of the algorithm is quite dubious (my words, not Kis's), with:

...errors of at least 5.5 in 21.1% of the time.

That's an error in the estimated BMI of 5.5 in over 20 percent of cases. That extent of measurement error would be problematic. To that, I would add that it is unclear whether the sample that the machine learning algorithm was trained on included people from post-Soviet countries. The relationship between facial features and BMI could well be ethnicity-specific, in ways that systematically bias the results. We have no way of knowing. And Blavatskyy didn't address this point in his reply.

Now, the point of this post is the distinction between correlation and causation. From what I have seen, this seems a likely candidate for confounding. There are any number of variables that might increase both politician BMI and corruption, without corruption being a cause of higher politician BMI. As one example, a country with high inequality might simultaneously have high corruption (with petty officials willing to take bribes to supplement their low incomes) and high politician BMI (since politicians would likely be among the wealthy class in society). Blavatskyy doesn't consider confounding variables such as inequality, or differences in age distribution, or differences in average BMI in the population, or regional differences in diet, in his analysis.
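To see how a confounder alone can generate such a correlation, here is a minimal simulation. The variable names and coefficients are illustrative assumptions, not estimates from the paper:

```python
# A minimal sketch of confounding: a hidden variable ('inequality') drives
# both corruption and politician BMI, so the two correlate strongly even
# though neither one causes the other. All coefficients are assumptions.
import random

random.seed(42)

def simulate(n=200):
    corruption, bmi = [], []
    for _ in range(n):
        inequality = random.gauss(0, 1)  # hidden confounder
        corruption.append(0.8 * inequality + random.gauss(0, 0.5))
        bmi.append(0.8 * inequality + random.gauss(0, 0.5))  # no causal link to corruption
    return corruption, bmi

def correlation(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

c, b = simulate()
print(round(correlation(c, b), 2))  # strongly positive, with no causal path between them
```

Regressing one simulated variable on the other would give a 'significant' relationship every time, which is exactly why a good story about the correlation is not enough.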

Now, to be fair to Blavatskyy, he doesn't adopt a causal interpretation of his results (except in his response to Kis, as I quoted above). Instead, Blavatskyy argues that, if BMI and perceived corruption are correlated, then we might infer how much corruption is being experienced in a country by looking at the median BMI of its politicians. However, even that inference is problematic, and Blavatskyy should know why. He gives the example of Swiss watches in China as a proxy for corruption, but then notes that:

...the rise of social media and Internet anti-corruption platforms in 2011–2012 made it no longer possible to measure grand political corruption through visible luxury Swiss watches. Luxury Swiss watches could still be a popular expenditure of corrupt governmental officials, but these officials are now more careful not to reveal their Swiss watches to the general public.

When politicians realised that their Swiss watches were giving away their corruption, they stopped showing off their Swiss watches. If politicians realised that their expanding waistlines were giving away their corruption, wouldn't they invest more in personal trainers (or liposuction)? As soon as this correlation was used for inference, the correlation would likely start to break down. This again illustrates the limited usefulness of such proxies.

Correlation does not imply causation. And sometimes, correlation today does not imply correlation in the future. We need to be much more cautious when considering analyses like this one.

Sunday, 22 March 2026

The impact of Taylor Swift on the Kansas City Chiefs' TV ratings

In 2023, Taylor Swift began a relationship with Kansas City Chiefs tight end Travis Kelce. After that, Kansas City Chiefs broadcasts seemed increasingly eager to cut to shots of Taylor Swift in the corporate boxes, rather than fans in the stands. The NFL was clearly trying to appeal to Swift's fans, but did it work? In a new article published in the Journal of Sports Economics (sorry, I don't see an ungated version online), Kerianne Rubenstein (Syracuse University) and Frank Stephenson (Berry College) show that it did.

Rubenstein and Stephenson collated data on 247 NFL games played in the 2022 and 2023 seasons, noting that the first Chiefs game that Taylor Swift attended was in the third week of the 2023 season. They apply a difference-in-differences analysis, comparing the change in TV ratings from before to after Week 3 of 2023 for the Chiefs with the corresponding change for other teams, while controlling for other variables expected to affect TV ratings. In other words, Rubenstein and Stephenson check whether the Chiefs' TV ratings increased by more than the average before-and-after change that other teams experienced. They find that:

...Chiefs’ games after Taylor Swift started attending see an increase of 2.15 ratings points, which is an approximately 32% increase relative to the mean Nielsen rating... total viewership increased by about 4.8 million after Taylor Swift started attending Chiefs’ games.
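The difference-in-differences arithmetic behind an estimate like that can be sketched with made-up numbers. The means below are hypothetical, chosen only so that the calculation reproduces the 2.15-point headline estimate (they are not the paper's data):

```python
# A sketch of the difference-in-differences calculation, using assumed
# (hypothetical) mean Nielsen ratings rather than the paper's actual data.

chiefs_before, chiefs_after = 6.7, 9.1   # treated team (Chiefs), before/after Week 3 of 2023
others_before, others_after = 6.5, 6.75  # comparison teams, same before/after split

# The DiD estimate is the Chiefs' change minus the change other teams saw,
# which nets out league-wide trends that affected all teams alike.
did = (chiefs_after - chiefs_before) - (others_after - others_before)
print(round(did, 2))  # 2.15 ratings points under these assumed numbers
```

The subtraction of the comparison teams' change is what distinguishes this from a simple before-and-after comparison: any league-wide increase in NFL viewership over the same period is removed from the estimate.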

So, it appears that Taylor Swift did increase TV ratings for the Kansas City Chiefs. Good news for the Chiefs (and for other NFL teams, who share in the broadcast revenue). Interestingly, and to be expected given Swift's young fan base, the effect was even larger on TV viewership among those aged 18-34 years, with a 40.1 percent increase in TV ratings.

An important question, though, is whether Swift attracted new fans, and whether they stuck around. In terms of the former, Rubenstein and Stephenson find some evidence that games played at the same time as Chiefs games suffered a decrease in TV ratings (although that analysis is based on a sample of only ten games, which limits how much we can take from it). However, they also find an increase in TV ratings when the Chiefs game was the only game in its timeslot. So, while there was some substitution between NFL games, new fans were also attracted to watch. And they did stick around: Rubenstein and Stephenson find limited evidence that the effect declined over time, with Chiefs games later in 2023 having a similar TV rating to those earlier in the season (it is worth noting that the Chiefs had a particularly good 2023 season though, finishing the regular season 11-6, winning their division, and ultimately winning Super Bowl LVIII).

Celebrities are a common feature of sports games. Rubenstein and Stephenson note the example of the Atlanta Hawks, who make courtside seats available to celebrities with large social media followings in the hopes of increasing game attendance and TV ratings. Not every celebrity has the profile of Taylor Swift. However, the results in this study suggest that the Hawks' approach might be a sensible way of increasing the profile of games. The NFL should take notice. Certainly, this would make much more sense than, as some conspiracy theorists would have you believe, biasing the officiating in favour of particular teams (like the Chiefs). So, leaving conspiracies aside, what we learn from this paper is that celebrity appearances at games can increase demand. That seems to be exactly what happened here, with Taylor Swift's presence helping to increase the audience for Kansas City Chiefs games.

Friday, 20 March 2026

This week in research #118

Here's what caught my eye in research over the past week (a quiet week, following last week's bumper edition):

  • Rubenstein and Stephenson assess the effect of Taylor Swift’s relationship with Travis Kelce on the Kansas City Chiefs’ television audience, and find that viewership increases by about one-third beginning with Swift’s first time attending a Chiefs’ game
  • Bussoli and Fattobene (open access) find that Financial Graph Literacy is lower among older adults, those with less education, and lower-income groups, and is significantly associated with a greater likelihood of engaging in proactive financial behaviours such as saving, investing, budgeting, and using digital financial tools