Wednesday, 18 December 2019

Randomising research publications

If you've ever had the misfortune of being drawn into a conversation with me about research funding, then you will have heard my view that, after an initial cull of low-quality research funding applications, all remaining applications should be assigned ping-pong balls, which are then drawn randomly from a bin to allocate the available funding. You could even make a big event of it - the researchers' equivalent of a live lotto draw.
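For concreteness, here is a minimal Python sketch of what such a draw might look like. The quality scores used for the initial cull, the threshold, and the budget figures are all hypothetical illustrations of the mechanism, not any funder's actual process:

```python
import random

def grant_lottery(applications, budget, cull_threshold, seed=None):
    """Cull weak applications, then fund the rest by random draw.

    `applications` is a list of (name, quality_score, amount_requested) tuples;
    the scores, threshold, and budget are all made-up inputs for illustration.
    """
    rng = random.Random(seed)

    # Step 1: the initial cull of low-quality applications.
    eligible = [app for app in applications if app[1] >= cull_threshold]

    # Step 2: the lotto draw - fund applications in random order, skipping any
    # the remaining budget can no longer cover.
    rng.shuffle(eligible)
    funded, remaining = [], budget
    for name, _, amount in eligible:
        if amount <= remaining:
            funded.append(name)
            remaining -= amount
    return funded

# Example with made-up applications: (name, quality score, NZ$ requested).
apps = [("A", 8, 150_000), ("B", 3, 120_000), ("C", 7, 150_000), ("D", 9, 100_000)]
print(grant_lottery(apps, budget=250_000, cull_threshold=5, seed=42))
```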

In fact, as this Nature article notes, this approach has begun to be adopted, including in New Zealand:
Albert Einstein famously insisted that God does not play dice. But the Health Research Council of New Zealand does. The agency is one of a growing number of funders that award grants partly through random selection. Earlier this year, for example, David Ackerley, a biologist at Victoria University of Wellington, received NZ$150,000 (US$96,000) to develop new ways to eliminate cells — after his number came up in the council’s annual lottery.
“We didn’t think the traditional process was appropriate,” says Lucy Pomeroy, the senior research investment manager for the fund, which began its lottery in 2015. The council was launching a new type of grant, she says, which aimed to fund transformative research, so wanted to try something new to encourage fresh ideas...
...supporters of the approach argued that blind chance should have a greater role in the scientific system. And they have more than just grant applications in their sights. They say lotteries could be used to help select which papers to publish — and even which candidates to appoint to academic jobs.
The latest issue of the journal Research Policy has an interesting article by Margit Osterloh (University of Zurich) and Bruno Frey (University of Basel), which argues for randomisation of the selection of which papers to publish (it seems to be open access, but just in case here is an ungated version). Their argument relies on the fact that Journal Impact Factors (JIFs) are a poor measure of the quality of individual research papers, and yet a lot of research is evaluated in terms of the impact factor of the journal in which it is published. Moreover, they note that:
...many articles whose frequency of citation is high were published in less well-ranked journals, and vice versa... Therefore, it is highly problematic to equate publication in “good” academic journals with “good” research and to consider publication in low-ranked journals automatically as signifying less good research.
Despite this problem with impact factors, they continue to be used. Osterloh and Frey argue that this is because the incentives are wrong. Researchers who publish in a high impact factor journal are benefiting from 'borrowed plumes', because the journal impact factor is largely driven by a small number of highly cited papers:
It is exactly the skewed distribution of citations that is beneficial for many authors. As argued, the quality of two thirds to three quarters of all articles is overestimated if they are evaluated according to the impact factor of the journal in which they were published. Thus, a majority of authors in a good journal can claim to have published well even if their work has been cited little. They are able to adorn themselves with borrowed plumes...
Osterloh and Frey present three alternatives to the current journal publication system, before presenting their own fourth alternative:
When reviewers agree on the excellent quality of a paper, it should be accepted, preferably on an “as is” basis (Tsang and Frey, 2007). Papers perceived unanimously as valueless are rejected immediately. Papers that are evaluated differently by the referees are randomized. Empirical research has found reviewers’ evaluations to be more congruent with poor contributions (Cicchetti, 1991; Bornmann, 2011; Moed, 2007; Siler et al., 2015) and fairly effective in identifying extremely strong contributions (Li and Agha, 2015). However, reviewers’ ability to predict the future impact of contributions has been shown to be particularly limited in the middle range in which reviewers’ judgements conform to a low degree (Fang et al., 2016). Such papers could undergo a random draw.
In other words, the best papers are accepted immediately, the worst papers are rejected immediately, and the papers where the reviewers disagree are accepted (or rejected) at random. Notice the similarity to my proposal for research grant funding at the start of this post.
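To make the three-way rule concrete, here is a rough sketch in Python. The referee scores, the cut-offs for 'unanimously excellent' and 'unanimously valueless', and the 50/50 draw for the middle group are all illustrative assumptions, not details taken from Osterloh and Frey's paper:

```python
import random

def triage_decision(review_scores, accept_cut=4, reject_cut=2, seed=None):
    """Triage-plus-lottery sketch: unanimous accept, unanimous reject,
    or a random draw when the reviewers disagree.

    `review_scores` are hypothetical 1-5 referee ratings; the cut-offs and
    the 50/50 draw are illustrative choices.
    """
    rng = random.Random(seed)
    if all(score >= accept_cut for score in review_scores):
        return "accept"   # reviewers agree the paper is excellent
    if all(score <= reject_cut for score in review_scores):
        return "reject"   # reviewers agree the paper is valueless
    return rng.choice(["accept", "reject"])  # middle range: randomise

print(triage_decision([5, 5, 4]))            # accept
print(triage_decision([1, 2, 2]))            # reject
print(triage_decision([5, 2, 3], seed=7))    # decided by the draw
```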

The journal issue that the Osterloh and Frey article was published in also has three comments on the article. The first comment is by Ohid Yaqub (University of Sussex), who notes a number of unresolved questions about the proposal, and essentially argues for more research before any radical shake-up of the journal publication system is implemented:
Randomisation in the face of JIF may carry unintended consequences and may not succeed in dislodging the desire for journal rankings by some other measure(s). It should wait until we have more widely appreciated theory on peer review and citation, more inclusive governing bodies that can wield some influence over rankings and their users, and a stronger appetite for investing in evaluation.
The second comment is by Steven Wooding (University of Cambridge), who also argues that journal impact factors are used as a heuristic (a rule of thumb) for judging research quality. Like Yaqub, he calls for more evidence, but in his case evidence on why people use journal impact factors, and on testing and evaluating the alternatives:
If you want people to stop using a heuristic, you need to ask what they are using it for, why they are using it, and to understand what their other options are. We agree that JIF use needs to be curbed. Our difference of opinions about trends in JIF use and the best way to reduce it should be settled by good evidence on whether its use is increasing or falling; where and why JIF is still used; and by testing and evaluating different approaches to curb JIF use.
The third comment is by Andrew Oswald (University of Warwick), who presents a mathematical case in favour of randomisation. Oswald shows that, if the distribution of research paper quality is convex (as would be the case if there are a few blockbuster papers, and many papers of marginal research significance), then randomisation is preferable:
Consider unconventional hard-to-evaluate papers that will eventually turn out to be either (i) intellectual breakthroughs or (ii) valueless. If the path-breaking papers are many times more valuable than the poor papers are valueless, then averaging across them will lead to a net gain for society. The plusses can be so large that the losses do not matter. This is a kind of convexity (of scientific influence). Averaging across the two kinds of papers, by drawing them randomly from a journal editor's statistical urn, can then be optimal.
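A back-of-the-envelope calculation makes the convexity point concrete. With purely made-up numbers (they are not from Oswald's comment), the expected value of publishing a hard-to-evaluate paper drawn at random can be positive even though most such papers turn out to be duds:

```python
# Illustrative version of the convexity argument, with made-up numbers:
# among hard-to-evaluate papers, suppose 1 in 20 is a breakthrough worth 100
# (in arbitrary units of scientific value) and the other 19 are duds that
# cost 1 each (reviewer and reader time wasted).
p_breakthrough = 1 / 20
value_breakthrough = 100
cost_dud = -1

expected_value = p_breakthrough * value_breakthrough + (1 - p_breakthrough) * cost_dud
print(expected_value)  # about 4.05, i.e. positive despite 19 duds per breakthrough
```

So long as the occasional breakthrough is valuable enough relative to the cost of a dud, the random draw pays off on average.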
Finally, Osterloh and Frey offer a response to the comments of the other three authors (at least for now, all of the comments and the response are open access).

A huge amount of under-recognised time and effort is spent on the research quality system. Journal reviewers are typically unpaid, as often are reviewers of research grants and those on appointments committees. However, there is a large opportunity cost to the time they spend reviewing manuscripts or applications, and it isn't clear that the benefits of this effort outweigh the costs. As Osterloh and Frey note, there seems to be little correlation between the comments and ratings of different reviewers, and every academic researcher can relate stories about occasions where they have clearly lost in the 'reviewer lottery'. In the face of these issues, it is time to reconsider key aspects of the way that research is funded and published, and the way that academic appointments are made. If executed well, an approach based on randomisation would save significantly on cost, while potentially doing a better job of ensuring that high-quality research is funded and published, and that high-quality applicants are not overlooked.

[HT: Marginal Revolution for the Nature article]

2 comments:

  1. Just as there is too much emphasis on JIF for journal publications, there is too much emphasis in grant applications on the researcher's past publications. It's better to evaluate on the ideas/future than on past success.

    Replies
    1. I'm totally with you on that. Past publications, and whether they have had grants in the past, are drivers of future grant application success.
