Wednesday, 7 January 2026

Lessons from Joshua Gans on AI for economics research

I've been increasingly using generative AI (specifically, ChatGPT) to assist with my research. I've been quite cautious, though, worrying a lot about the quality of AI output, although my worries eased substantially once ChatGPT started linking to its sources. However, I know that many other researchers are using generative AI far more extensively and directly in their research than I am. My approach continues to be to treat ChatGPT as an enthusiastic, but not fully polished, research assistant. Given my experience so far, I was interested to read the reflections of Joshua Gans on his year of using generative AI for economics research. His approach was:

I had lots of ideas for papers that I hadn’t developed, so I decided to spend the year working my way down the list. I would also add new ideas as they came to me. My proposed workflow was all about speed. Get papers done and out the door as quickly as possible, where a paper would only be released if I decided I was “satisfied” with the output. So it cut any peer reviews or discussions out during the process of generating research quickly, but I would send those papers to journals for validation. If I produced a paper that I didn’t think could be published (or shouldn’t be), then I would discard it. There were many such papers.

Like Gans, I have a lot of research ideas, and not enough time to pursue all of them. Many of my ideas would go nowhere, even if I did have time to pursue them. But for some others, I have later read research papers by others who did something I had thought of earlier but never had time to do myself. There are therefore a lot of missed opportunities, because it isn't possible to perfectly identify the good ideas in advance - you need to try ideas out before you realise that they are uninteresting or dead ends. Being able to try out more ideas seems like a good thing.

The opportunity cost of spending time pursuing one research idea is not pursuing other ideas. If generative AI allows us to pursue research ideas in less time, then it lowers the opportunity cost of pursuing those ideas. However, as Gans notes:

When you lower the cost of doing something, you do more of it. Normally, the decision whether to continue or abandon a project gives rise to some introspection (or rationalisation) of whether continuing is worthwhile relative to the costs. When the going gets tough, you drop ideas that don’t look as great.

The issue with an AI-first approach is that its benefit, reducing the toughness of going, is also its weak point; you don’t face those decision points of continuing/abandoning as often. That means that you are more likely to end up completing a project. But this lack of decision points means that you end up pursuing more lower-quality ideas to fruition than you would otherwise.

In Gans's experience, AI increases research output, but because it decreases the marginal cost of continuing the research, it also weakens the stopping rule that would otherwise kill bad ideas early. When the marginal cost of continuing is lower, we spend more time on each bad idea before discarding it. And if we spend too long on bad ideas, the opportunity cost of pursuing any given idea increases - time spent working on bad ideas is time not spent pursuing ideas that turn out to be better. As a result, the average quality of our research may decrease. That is a risk that deserves careful consideration.
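To make that intuition concrete, here is a minimal sketch (my own toy model, not anything from Gans's post) of how a lower marginal cost of continuing weakens the stopping rule. Ideas have random quality, the researcher abandons an idea whenever a noisy signal of its quality falls below the cost of continuing, and lowering that cost means more ideas get completed but their average quality falls. The threshold rule and all of the numbers are illustrative assumptions:

```python
import random

random.seed(42)

def completed_quality(continue_cost, n_ideas=10_000):
    """Toy model: each idea has a true quality drawn from U(0, 1).
    At each of three go/no-go stages, the researcher sees a noisy
    signal of quality and abandons the idea if the signal doesn't
    justify the cost of continuing. Returns the share of ideas
    completed and their average quality."""
    completed = []
    for _ in range(n_ideas):
        quality = random.random()
        abandoned = False
        for _stage in range(3):  # three decision points per idea
            signal = quality + random.gauss(0, 0.1)  # noisy read on quality
            if signal < continue_cost:  # not worth the cost of continuing
                abandoned = True
                break
        if not abandoned:
            completed.append(quality)
    share = len(completed) / n_ideas
    avg_quality = sum(completed) / len(completed) if completed else float("nan")
    return share, avg_quality

for cost in (0.5, 0.2):  # high vs low marginal cost of continuing
    share, avg = completed_quality(cost)
    print(f"continue_cost={cost}: {share:.0%} of ideas completed, "
          f"average quality of completed ideas = {avg:.2f}")
```

With these illustrative numbers, the low-cost case completes a much larger share of ideas, but the average quality of the completed ideas is lower - which is exactly the pattern Gans describes.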

Although he doesn't frame it in terms of a rising opportunity cost, Gans's conclusion seems to point in that direction:

My point is that the experiment — can we do research at high speed without much human input — was a failure. And it wasn’t just a failure because LLMs aren’t yet good enough. I think that even if LLMs improve greatly, the human taste or judgment in research is still incredibly important, and I saw nothing over the course of the year to suggest that LLMs were able to encroach on that advantage. They could be of great help and certainly make research a ton more fun, but there is something in the judgment that comes from research experience, the judgment of my peers and the importance of letting research gestate that seems more immutable to me than ever.

Generative AI has the potential to increase both the quality and the quantity of research. Gans seems to have seen that in his work, and I've already seen it in mine. In fact, my experience so far has been that careful use of generative AI (for example, checking for gaps in the literature, or exploring econometric methods and robustness checks) has reduced the time I waste on fruitless research that would have gone nowhere. However, it is possible to use too much generative AI in research, just as it is possible to use too little. There is a middle ground, and Gans seems to be finding it from one direction (starting from over-using generative AI), while maybe I am finding it from the other (starting from under-using it). The important thing seems to be ensuring that a human is kept in the loop (as Ethan Mollick noted in his book Co-Intelligence, which I reviewed here). Specifically, we can use generative AI for its strengths: testing our initial ideas, mapping the literature, exploring alternative methods, or stress-testing assumptions. And we can keep the human in the loop by pausing at key decision points to consider where the research has gotten to and check it against our intuition, as well as by continuing to seek peer review of the draft end-product.

So, Gans might be holding back on the generative AI this year, but I'll be further expanding my use. I'll start with something related to this post: writing up some research on using AI tutors in teaching first-year economics, which I presented in a brown-bag seminar at Waikato last month (and I will have more on that in a future post).

[HT: Marginal Revolution]
