Thursday, 18 August 2022

Curb your enthusiasm for randomised controlled trials

I've often referred to randomised controlled trials (RCTs) as the gold standard for microeconomic research. That's because, when you randomise which units are treated, you can generally extract an unbiased estimate of the causal effect of some intervention or other change that you are trying to evaluate. Other methods of causal inference that I often refer to on this blog (like instrumental variables, or regression discontinuity) use careful research designs to try to mimic what you get from an RCT.
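To make the point about unbiasedness concrete, here is a minimal simulation sketch. Everything in it is an assumption made up for illustration: the true effect of 2.0, the unobserved 'ability' trait, the sample size, and the `simulate` function itself. It compares a simple difference in means when treatment is assigned by coin flip with the same comparison when units select into treatment based on a trait that also affects their outcome.

```python
import random
import statistics

TRUE_EFFECT = 2.0  # assumed true causal effect, chosen purely for illustration


def simulate(n, randomised, rng):
    """Return the difference in mean outcomes between treated and untreated units."""
    treated_outcomes, control_outcomes = [], []
    for _ in range(n):
        ability = rng.gauss(0, 1)  # hypothetical unobserved trait that raises the outcome
        if randomised:
            # Coin flip: treatment status is unrelated to ability
            treated = rng.random() < 0.5
        else:
            # Self-selection: higher-ability units are more likely to opt in
            treated = ability + rng.gauss(0, 1) > 0
        outcome = ability + TRUE_EFFECT * treated + rng.gauss(0, 1)
        (treated_outcomes if treated else control_outcomes).append(outcome)
    return statistics.mean(treated_outcomes) - statistics.mean(control_outcomes)


rng = random.Random(42)  # fixed seed so the sketch is reproducible
rct_estimate = simulate(20_000, randomised=True, rng=rng)
naive_estimate = simulate(20_000, randomised=False, rng=rng)

print(f"True effect:            {TRUE_EFFECT:.2f}")
print(f"Randomised estimate:    {rct_estimate:.2f}")
print(f"Self-selected estimate: {naive_estimate:.2f}")
```

Under these assumptions, the randomised estimate should land close to the true effect, while the self-selected comparison should overstate it, because the treated group has higher ability on average and that difference gets attributed to the treatment. This is only a toy model of selection bias, and it says nothing about the practical and ethical limits of RCTs discussed below.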

However, despite the boosterism of high-profile economists like Andrew Leigh (whose book, Randomistas, I reviewed here) and Nobel Prize winners Abhijit Banerjee and Esther Duflo (whose book, Poor Economics, I reviewed here), calling RCTs the gold standard is not without controversy. There are a number of RCT sceptics, one of whom is Martin Ravallion (Georgetown University, and previously with the World Bank). I just finished reading this 2018 CGD Working Paper by Ravallion, where he outlines his argument against RCTs (in the context of development, which is the context in which economists most commonly employ RCTs, although they are increasingly common across applied microeconomics). The working paper isn't as anti-RCT as other critiques, and I think it would be fair to say that Ravallion would simply prefer that people don't overstate what RCTs are capable of, and that researchers wouldn't use them to the exclusion of all alternative methods. As he writes in the conclusion:

We are seeing a welcome shift toward a culture of experimentation in fighting poverty, and addressing other development challenges. RCTs have a place on the menu of tools for this purpose. However, they do not deserve the special status that advocates have given them, and which has so influenced researchers, development agencies, donors and the development community as a whole. To justify a confident ranking of two evaluation designs, we need to know a lot more than the fact that only one of them uses randomization.

The claimed hierarchy of methods, with randomized assignment being deemed inherently superior to observational studies, does not survive close scrutiny...

The questionable claims made about the superiority of RCTs as the “gold standard” have had a distorting influence on the use of impact evaluations to inform development policymaking, given that randomization is only feasible for a non-random subset of policies... The tool is only well suited to a rather narrow range of development policies, and even then it will not address many of the questions that policymakers ask. Advocating RCTs as the best, or even only, scientific method for impact evaluation risks distorting our knowledge base for fighting poverty.

I think the critique is quite measured and reasonably persuasive. However, I doubt it will cause the 'randomistas' to re-evaluate their approach. At best, it might give pause to the broader development community, to think about which contexts are best suited to RCTs, and to adopt other methods in situations where RCTs are less appropriate. In that vein, Ravallion highlights the ethical issues associated with RCTs, which possibly haven't gotten the attention they deserve:

RCTs are also ethically contestable in a way that experimentation using observational studies is not. The ethical case against RCTs cannot be judged properly without assessing the expected benefits from new knowledge, given what is already known. Review boards need to give more attention to the ex-ante case for deliberately withholding an intervention from those who need it, and deliberately giving it to some who do not, for the purpose of learning.

On that last point, it is likely that there are a lot of RCTs undertaken that are, at best, unnecessary, and at worst, possibly unethical. That would be the case where the researchers are already pretty sure that an intervention has a positive effect, and the RCT is being employed to see how big the effect is, rather than whether there is any effect at all. In that case, withholding the intervention from the control group is exposing them to worse outcomes (however measured), simply for research purposes. On the other hand, there are genuine cases where an intervention can't be rolled out to everyone who is eligible, because of resource constraints. That might overcome the ethical issues to some extent. There are some curly questions here that ethics committees should be paying more attention to.

Ravallion hasn't changed my mind about the 'gold standard' nature of RCTs. I already recognised that there were many contexts where RCTs are impractical and other methods should be employed, or where randomisation cannot be strictly adhered to. In those contexts, a mix of alternative methods is available. Your views may well be different. Regardless, the points that Ravallion raises should temper any enthusiasm for RCTs.
