Friday, 11 March 2022

The effects of training policy makers in econometrics

David Card, Joshua Angrist, and Guido Imbens shared the Nobel Prize in economics last year for their contributions to the 'credibility revolution' in economics. That revolution involved the adoption of a range of new econometric methods designed to extract causal estimates, and it set a much higher standard for what constitutes sound evidence for policy making. However, policy makers have not necessarily caught up. It seems that there could be substantial gains in better policy from training policy makers to use the insights from the credibility revolution.

That is essentially what this 2021 paper by Sultan Mehmood (New Economic School), Shaheen Naseer (Lahore School of Economics) and Daniel Chen (Toulouse School of Economics) sets out to investigate. Mehmood et al. conducted a thorough randomised controlled trial involving deputy ministers in Pakistan. As they explain:

We conducted a randomized evaluation implemented through close collaboration with an elite training academy. The Academy in Pakistan is one of the most prestigious training facilities that prepares top brass policymakers—deputy ministers—for their jobs. These high-ranking policy officials are selected through a highly competitive exam: about 200 are chosen among 15,000 test-takers annually.

There are a lot of moving parts to this research, so it is difficult to summarise (but I'm going to try!). First, Mehmood et al.:

...conducted a baseline survey and asked the participants to choose one of two books (1) Mastering ’Metrics: The Path from Cause to Effect by Joshua Angrist and Jörn-Steffen Pischke or (2) Mindsight: The New Science of Personal Transformation by Daniel J. Siegel...

To be precise, they asked the deputy ministers to choose a high or low probability of receiving each of the two books, and then they randomised which book each deputy minister actually received (a point we will return to later). The book Mastering 'Metrics (which I reviewed here) matters here, because it is the effect of assigning that book that Mehmood et al. set out to test. Mastering 'Metrics is essentially an exposition of the methods that constitute the credibility revolution, and it presents the randomised controlled trial (RCT) as the 'experimental ideal'. However, the treatment doesn't stop with the assignment of a book to read:

The meat of our intervention is intensive training where we aim to maximize the comprehension, retention, and utilization of the educational materials. Namely, we augmented the book receipt with lectures from the books’ authors, namely, Joshua Angrist and Daniel Siegel, along with high-stakes writing assignments... As part of the training program, deputy ministers were assigned to write two essays. The first essay was to summarize every chapter of their assigned book, while the second essay involved discussing how the materials would apply to their career. The essays were graded and rated in a competitive manner. Writers of the top essays were given monetary vouchers and received peer recognition by their colleagues (via commemorative shields, a presentation and discussion of their essays in a workshop within the treatment arm). Deputy ministers in each treatment group also participated in a zoom session to present, discuss the lessons and applications of their assigned book in a structured discussion.

Performance in the training programme was highly incentivised, not only because of the rewards on offer, but also because the grades matter for the future career progression of each deputy minister. So, the deputy ministers had a strong incentive to participate fully.
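As an aside, the assignment mechanism is worth pausing on, because it does two jobs at once: it respects the ministers' stated preferences (by giving them a high or low probability of receiving their preferred book), while still randomising who actually receives Mastering 'Metrics. Here is a minimal sketch of that kind of mechanism. To be clear, this is not the authors' code, and the 0.75/0.25 probabilities and cohort size are purely illustrative assumptions.

```python
import random

random.seed(42)

def assign_book(prefers_metrics: bool) -> str:
    """Randomise book receipt; the stated preference only shifts the odds."""
    p_metrics = 0.75 if prefers_metrics else 0.25  # hypothetical probabilities
    return "Mastering 'Metrics" if random.random() < p_metrics else "Mindsight"

# A hypothetical cohort of 200 deputy ministers, half preferring each book
ministers = [{"id": i, "prefers_metrics": i % 2 == 0} for i in range(200)]
for m in ministers:
    m["assigned_book"] = assign_book(m["prefers_metrics"])

n_metrics = sum(m["assigned_book"] == "Mastering 'Metrics" for m in ministers)
print(f"{n_metrics} of {len(ministers)} ministers assigned Mastering 'Metrics")
```

Because the stated preference only shifts the odds, the preference itself can later be used to check whether 'high demanders' of the metrics book respond differently to treatment (more on that below).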

Mehmood et al. then test the effect of being assigned the Mastering 'Metrics treatment on a range of outcomes measured four to six months after the workshop, finding that:

While attitudes on importance of qualitative evidence are unaffected, treated individuals' beliefs about the importance of quantitative evidence in making policy decisions increases from 35% after reading the book and completing the writing assignment and grows to 50% after attending the lecture, presenting, discussing and participating in the workshop. We also find that deputy ministers randomly assigned to causal training have higher perceived value of causal inference, quantitative data, and randomized control trials. Metrics training increases how policymakers rate the importance of quantitative evidence in policymaking by about 1 full standard deviation... When asked what actions to undertake before rolling out a new policy, they were more likely to choose to run a randomized trial, with an effect size of 0.33 sigma after completing the book and writing assignment (partial training) and 0.44 sigma after attending the lecture, presentation, discussion and workshop (full training). We also observe substantial performance improvements in scores on national research methods and public policy assessments.
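For readers not used to effect sizes reported 'in sigma', the conversion is straightforward: the difference in mean outcomes between treatment and control groups is divided by the standard deviation of the outcome (typically in the control group). A minimal sketch with made-up numbers, just to illustrate the arithmetic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ratings (0-100) of the importance of quantitative evidence
control = rng.normal(loc=50, scale=10, size=100)
treated = rng.normal(loc=54, scale=10, size=100)

# Effect size "in sigma": mean difference scaled by the control-group SD
effect_in_sigma = (treated.mean() - control.mean()) / control.std(ddof=1)
print(f"Effect size: {effect_in_sigma:.2f} standard deviations")
```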

Mehmood et al. also conducted a field experiment to evaluate how much the deputy ministers would be willing to pay for different types of evidence (RCTs, correlational data, and expert bureaucrat advice), and find that:

...treated deputy ministers were much more willing to spend out of pocket (50% more) and from public funds (300% more) for RCTs and less willing to pay for correlation data (50% less). Demand for senior bureaucrats’ advice is unaffected.

Mehmood et al. then ran a second field experiment, where:

First, we elicited initial beliefs about the efficacy of deworming on long-run labor market outcomes. Then, they were asked to choose between implementing a deworming policy versus a policy to build computer labs in schools... Next, we provided a signal - a summary of a recently published randomized evaluation on the long-run impacts of deworming... After this signal, we asked the same deputy ministers about their post-signal beliefs and to make the policy choice again.

Mehmood et al. find substantial effects:

From this experiment, we observe that only those assigned to receive training in causal thinking showed a shift in their beliefs about the efficacy of deworming: the treated ministers became more likely to choose deworming as a policy after receiving the RCT evidence signal. The magnitudes are substantial - trained deputy ministers doubled the likelihood to choose deworming, from 40% to 80%. Notably, this shift occurs only for those ministers who previously believed the impacts of deworming were lower than the effects found in the RCT study, while those who previously believed the effects were larger than the estimate reported in the signal did not shift their choice of policy.
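That asymmetry is what belief updating would predict: only ministers whose prior beliefs about deworming's impact sat below the RCT estimate in the signal have a reason to revise upwards and switch policies. Here is a minimal sketch of how such a check might look. The data, column names, and the value of the signalled estimate are all hypothetical, not taken from the paper.

```python
import pandas as pd

# Hypothetical signalled estimate of deworming's long-run impact (not from the paper)
RCT_ESTIMATE = 0.13

# Hypothetical pre- and post-signal data for five ministers
df = pd.DataFrame({
    "prior_belief":         [0.05, 0.20, 0.10, 0.30, 0.08],
    "chose_deworming_pre":  [0,    1,    0,    1,    0],
    "chose_deworming_post": [1,    1,    1,    1,    0],
})

df["prior_below_signal"] = df["prior_belief"] < RCT_ESTIMATE
df["switched_to_deworming"] = (df["chose_deworming_post"] == 1) & (df["chose_deworming_pre"] == 0)

# Share switching towards deworming, split by whether the prior was below the signal
print(df.groupby("prior_below_signal")["switched_to_deworming"].mean())
```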

Importantly, experimenter demand effects are likely to be limited, because as part of the experiment the researchers recommended that the deputy ministers choose the policy to build computer labs. Since that recommendation pushes against choosing deworming, the estimated effects are, if anything, underestimates.

Next, Mehmood et al. test the effects on prosocial behaviour, which might be a concern if you think that studying economics makes students less ethical (see here, or here, or here). On this point:

The administrative data also included a suite of behavioral data in the field, for example, a choice of field visits to orphanages and volunteering in low-income schools. This allowed us to assess potential crowdout of prosociality, an oft-raised concern about the teaching of neoclassical economics... We detected no evidence of econometrics training crowding out prosocial behavior - orphanage field visits, volunteering in low-income schools and language associated with compassion, kindness and social cohesion is not significantly impacted. Scores on teamwork assessments as a proxy of soft skills were also unaffected...

Finally, the randomisation of deputy ministers to books was important. The elicited book preferences give an indication of each minister's demand for the econometrics training, and ministers who preferred Mastering 'Metrics could differ in meaningful ways from those who preferred Mindsight, which might affect the estimated impact of the treatment. On this point, Mehmood et al. note that:

A typical concern in RCTs is that the compliers respond to treatment and we estimate Local Average Treatment Effect (LATE) since we do not observe defiers. It is a plausible concern that people who demand to learn causal thinking may be more responsive to the treatment assignment. Thus estimates of the treatment impacts would be uninformative on those who are potential non-compliers. In our unique experimental set-up, we developed a proxy for compliers through those who demanded the metrics book; we show that the effects are the same for both the high and low demanders... we observe no significant differences between the treatment effects for low and high demanders of metrics training.
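For readers unfamiliar with the jargon, the Local Average Treatment Effect (LATE) is the standard textbook result (covered in Mastering 'Metrics) that, when compliance with random assignment is imperfect, an instrumental variables (Wald) estimator recovers the average treatment effect for compliers only:

```latex
\text{LATE}
  = \mathbb{E}\bigl[\,Y_i(1) - Y_i(0) \,\bigm|\, \text{complier}\,\bigr]
  = \frac{\mathbb{E}[\,Y_i \mid Z_i = 1\,] - \mathbb{E}[\,Y_i \mid Z_i = 0\,]}
         {\mathbb{E}[\,D_i \mid Z_i = 1\,] - \mathbb{E}[\,D_i \mid Z_i = 0\,]}
```

where Z_i is the random assignment, D_i is actual take-up of the treatment, and Y_i is the outcome of interest. The worry Mehmood et al. are addressing is that an effect identified only for compliers might not generalise to everyone else, which is why showing similar treatment effects for both high and low demanders of the metrics book is a reassuring check.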

There is a huge amount of additional detail in the paper. The overall takeaway is that understanding the importance of causal estimates makes a significant difference to the preferences and decisions of policy makers, and therefore can contribute to better policy. We have these results for Pakistan, but it would be interesting to see whether they hold in other contexts. And if they do, then the training of policy makers should include training in basic econometrics.

[HT: Markus Goldstein at the Development Impact blog]
