Sunday 2 October 2022

Good reason to avoid mediation analysis

Following on from yesterday's post on the problems with instrumental variables analysis, I read this post by Uri Simonsohn on the Data Colada blog about mediation analysis. Mediation analysis has always struck me as somewhat odd, and it isn't an approach that is common in economics. That turns out to be fortunate, because Simonsohn points out that the problems with mediation analysis are quite serious:

In mediation analysis one asks by what channel did a randomly assigned manipulation work. For example, suppose that an experiment finds that randomly assigning Calculus 101 students to have quizzes every week (X) increased their final exam grade (Y).  Mediation analysis is used to test whether this happened because quizzes led students to study more hours through the semester (M). Mediation is present if the estimated effect of X gets smaller when controlling for M...

The problem of interest to this post is that if there is any variable, besides X, that correlates with M and Y (a very likely scenario), mediation is invalid.

Notice the similarity to yesterday's post about instrumental variables analysis. Instrumental variables analysis can still be valid in many cases, provided there is a strong theoretical basis for the exogeneity of the instrument. For mediation analysis, by contrast, this problem is probably fatal for almost all applications. Simonsohn provides a very clear explanation of why, and concludes:

In general, if we do mediation analysis, it means we expect X to lead M and Y to be correlated in our experiment. If we expect that, we should expect that other factors, confounds, cause M and Y to be correlated outside our experiment.

This post explains why such correlation invalidates mediation. In other words, this post explains why, in general, we should expect mediation to be invalid.
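
To make the argument concrete, here is a minimal simulation sketch. It is not taken from Simonsohn's post, and the variable names, effect sizes, and sample size are all assumptions chosen for illustration. Quizzes (X) are randomly assigned and raise both study hours (M) and grades (Y), but by construction M has no effect on Y at all. An unobserved confounder (U, say motivation) raises both M and Y. With these assumed coefficients, the total effect of X should come out near 1.0, while "controlling for M" should shrink it to about 0.5, which a standard mediation analysis would read as evidence that half the effect runs through M:

```python
import random

def mean(v):
    return sum(v) / len(v)

def slope(y, x):
    """OLS slope from regressing y on x (with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(y, x):
    """Residuals from regressing y on x (with an intercept)."""
    b = slope(y, x)
    a = mean(y) - b * mean(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]

def simulate(n=100_000, seed=1):
    """Hypothetical data-generating process (all coefficients are assumptions)."""
    rng = random.Random(seed)
    x = [float(rng.random() < 0.5) for _ in range(n)]  # randomly assigned quizzes
    u = [rng.gauss(0, 1) for _ in range(n)]            # unobserved confounder
    # Study hours depend on the quizzes and on the confounder.
    m = [xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]
    # Grades depend on the quizzes and on the confounder, but NOT on m.
    y = [xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]
    return x, m, y

x, m, y = simulate()

# Total effect: regress Y on X alone.
total_effect = slope(y, x)

# Coefficients from regressing Y on X and M together, obtained by
# partialling out the other regressor (Frisch-Waugh-Lovell theorem).
m_coef = slope(residuals(y, x), residuals(m, x))
x_coef_controlled = slope(residuals(y, m), residuals(x, m))

print(f"Effect of X on Y, no controls:      {total_effect:.2f}")
print(f"Effect of X on Y, controlling for M: {x_coef_controlled:.2f}")
print(f"Coefficient on M:                    {m_coef:.2f}")
```

In this sketch the true mediated effect is zero, yet the estimated effect of X falls once M is added and M picks up a sizeable coefficient. The only thing driving that result is the confounder, which the analyst cannot see.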

Simonsohn also provides some good references that further document the problems with mediation analysis (along with an interesting reading that strongly critiques path analysis more generally, which I will certainly follow up on in a future post). It is clear (if it wasn't already) that mediation analysis should be left out of the regular statistical toolbox.

[HT: David McKenzie on the Development Impact blog]
