Many claims have been made that higher inequality causes worse health outcomes. In fact, that is one of the central claims in the Wilkinson and Pickett book The Spirit Level (which I reviewed here). Of course, as I noted in my review, the relationships that Wilkinson and Pickett establish are correlations, not causal relationships. However, if we put that aside and accept that inequality worsens health outcomes, then we should expect there to be a robust association between inequality and life expectancy. In places with higher inequality, life expectancy should be lower ceteris paribus (holding all else equal). And looking at a single place, when inequality is lower, life expectancy should be higher ceteris paribus.
However, the literature is divided on whether such relationships exist. So, I was interested to read this recent article by Lisa Martin (University of Oxford) and Joerg Baten (University of Tübingen), published in the Journal of Economic Behavior and Organization (sorry, I don't see an ungated version online). Establishing a relationship between inequality and life expectancy requires variation in both variables. That variation is hard to come by when income and life expectancy data are available only at the country level, either for many countries over a small number of years, or for a few countries (all of them developed) over a longer time period.
Martin and Baten avoid this problem by using data on height and height inequality to fill in gaps in the data. These variables are available for a broader range of countries, including developing countries. Their data come from the Clio Infra project, which has data on a range of economic indicators for many countries going back to the early 1800s (and in some cases going back to the 1500s). The data that Martin and Baten use cover the period from 1820 to 2000, and are limited to countries in Asia and Africa.
Martin and Baten use data on height to estimate life expectancy, which relies on a fairly simple regression model that includes height and regional dummy variables. They then use data on the coefficient of variation in height to estimate the Gini coefficient measure of income inequality. Both estimates appear to do a reasonable job. Finally, Martin and Baten use the estimated life expectancy and income inequality variables, and look at the relationship between them, controlling for:
...(1) existence of a health insurance system, (2) wars, (3) the pandemic decade of the 1910s–flu, (4) malaria-intensive countries...
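To make the two-step construction of the variables concrete, here is a minimal sketch in Python. It is not the authors' code, and every number in it is made up for illustration. It assumes that the coefficient of variation is the standard deviation of height divided by the mean (expressed in percent), and that life expectancy is calibrated on mean height plus a single hypothetical regional dummy using ordinary least squares.

```python
import numpy as np

def coefficient_of_variation(heights_cm):
    """Coefficient of variation in percent: sample std. dev. relative to the mean."""
    heights = np.asarray(heights_cm, dtype=float)
    return 100.0 * heights.std(ddof=1) / heights.mean()

def fit_ols(X, y):
    """Least-squares coefficients for the linear model y = X @ beta."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

# Hypothetical calibration sample: cases where both mean height (cm)
# and life expectancy (years) are observed. All values are invented.
mean_height = np.array([160.0, 162.0, 165.0, 168.0, 170.0, 172.0])
region_b = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])  # hypothetical regional dummy
life_exp = np.array([35.0, 38.0, 44.0, 50.0, 55.0, 58.0])

# Step 1: regress life expectancy on an intercept, height and the dummy.
X = np.column_stack([np.ones_like(mean_height), mean_height, region_b])
beta = fit_ols(X, life_exp)

# Step 2: use the fitted coefficients to fill in life expectancy where
# only height is observed (here: 166 cm, region dummy equal to 0).
estimated_life_exp = np.array([1.0, 166.0, 0.0]) @ beta

# The inequality proxy: dispersion of individual heights within a group.
height_cv = coefficient_of_variation([158.0, 163.0, 166.0, 171.0, 175.0])
```

The same calibrate-then-predict logic would apply to the inequality side, with the Gini coefficient regressed on the height coefficient of variation instead.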
In their simplest model, which doesn't include the control variables (only time fixed effects), they find that:
The coefficient of inequality is estimated at approximately -0.23 and statistically significant. It implies that an increase in the Gini coefficient by one index point translates approximately into a two and a half month decrease in estimated life expectancy.
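As a quick check on the magnitude, the conversion from the coefficient to months can be sketched as follows. This assumes the coefficient is measured in years of life expectancy per Gini index point, which is my reading of the quote rather than something stated in it. Since the coefficient is only reported as approximately -0.23, the result should be read as the same rough order as the authors' figure, not as a precise reproduction of it.

```python
MONTHS_PER_YEAR = 12

def years_to_months(years):
    """Convert a change measured in years into months."""
    return years * MONTHS_PER_YEAR

# Coefficient from the quoted passage, assumed to be in years per Gini point.
coefficient_years_per_gini_point = -0.23

# Implied change in life expectancy (in months) for a one-point rise in the Gini.
change_in_months = years_to_months(coefficient_years_per_gini_point)
```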
And when the controls are included:
...we again observe a statistically significant negative coefficient for income inequality, though it is smaller than the significant estimates in [the simplest specification]...
Martin and Baten recognise that these results don't establish a causal relationship (which is a problem across this literature), so they then employ an instrumental variables approach. Their instrument of choice is:
...the ratio of the share of the land suitable for the cultivation of the “inequality crop” (sugar) to the share of the land suitable for the cultivation of the “equality crop” (wheat).
They justify this as follows:
A sugar plantation is a clear example of an agricultural production type of large-scale economies... On the other hand, wheat production is already highly productive on much smaller farm units, as has been amply demonstrated in the agricultural economics literature. The specialization of a country on the cultivation of large-scale cash crops is positively associated with inequality, whereas food crops such as wheat are not scale-intensive and were historically planted in smallholdings.
This ratio passes the smell test as a plausible instrument for inequality, although whether it affects life expectancy only through its effect on inequality is arguable (this exclusion restriction is a requirement for a valid instrument, but it cannot be tested). Anyway, in this IV analysis, Martin and Baten find that:
...the significant impact of inequality remains a consistent determinant of life expectancy.
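For readers unfamiliar with instrumental variables, the logic can be illustrated with a small simulation. This is a sketch on synthetic data, not a replication of Martin and Baten's analysis: the variable names, the parameter values, and the single unobserved confounder are all assumptions chosen for illustration. It uses two-stage least squares with one instrument, and it assumes by construction that the instrument affects the outcome only through inequality.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for y = X @ beta."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def two_stage_least_squares(y, x, z):
    """Point estimate of the effect of x on y, using z as the instrument."""
    const = np.ones(len(y))
    Z = np.column_stack([const, z])
    # First stage: keep only the part of x that is predicted by the instrument.
    x_hat = Z @ ols(Z, x)
    # Second stage: regress the outcome on the predicted values of x.
    return ols(np.column_stack([const, x_hat]), y)[1]

rng = np.random.default_rng(0)
n = 5000
crop_ratio = rng.normal(size=n)   # simulated instrument (stand-in for the crop ratio)
confounder = rng.normal(size=n)   # unobserved factor driving both variables

true_effect = -0.2                # assumed effect of inequality on life expectancy
gini = 40 + 3 * crop_ratio + 2 * confounder + rng.normal(size=n)
life_exp = 60 + true_effect * gini - 1.5 * confounder + rng.normal(size=n)

# The naive regression is biased because the confounder is omitted.
naive_estimate = ols(np.column_stack([np.ones(n), gini]), life_exp)[1]

# The IV estimate uses only the variation in inequality driven by the instrument.
iv_estimate = two_stage_least_squares(life_exp, gini, crop_ratio)
```

In this setup the naive estimate overstates the negative effect, while the IV estimate lands close to the assumed true effect. That only works because the simulated instrument satisfies the exclusion restriction by design, which is exactly the condition that cannot be verified with real data.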
Overall, this study seems to support a causal relationship between income inequality and life expectancy. In other words, lower inequality causes higher life expectancy. Martin and Baten aren't able to explicitly test the mechanisms that might explain this causal relationship, but nevertheless they conclude that:
...all these factors were at work for our sample of Africa and Asia in the last two centuries - a public goods effect that we can separate out with the health insurance system variable, a correlation with income (or poverty) and a psychosocial effect of less healthy behavior in more unequal societies.
However, before we simply accept these results at face value, we need to recognise that all of this is based on values of life expectancy and income inequality that are themselves estimated from other regression models. We should be somewhat cautious about results from models that use derived variables, or as in this case (I think), where only some of the data come from derived values (while the rest of the data are 'real'). This sort of approach imposes an additional structure on the data used in the model that does not exist in the real data, and could lead to biased results.
So, the results are interesting, and consistent with the negative correlation between income inequality and life expectancy established in other studies. However, these results are not definitive, and the question of whether the relationship is truly causal remains somewhat open.