Sex, Drugs and Economics: This research doesn’t convincingly show that biodiversity is good for business

I was interested to read this article in The Conversation last month by Paul Griffin (University of California, Davis) and Martien Lubberink (Victoria University of Wellington), mainly because of statements like this:

...firms operating in areas with richer biodiversity are measurably more productive.

I thought, that's interesting. This might be a good example to use in class next trimester to illustrate the difference between correlation and causation. After all, the authors may be correct that firms operating in areas with richer biodiversity are more productive (correlation), but that doesn't mean that biodiversity increases productivity (causation).

And then I read the paper that The Conversation article was based on. And at that point, I decided that I shouldn't use this as an example of the difference between correlation and causation, because even the correlations that they find are shaky at best.

The approach that Griffin and Lubberink take is to look at the relationship between measures of business output and measures of biodiversity. Their measure of business output is sales or gross profit, taken from Stats NZ's Longitudinal Business Database. They generally interpret this as a measure of productivity. And that is the first problem with the paper. Sales can be interpreted as gross revenue, and in some contexts sales may be used as a rough measure of gross output. But sales are not a good measure of productivity, and are not a good measure of the economic value created by a business. The more appropriate measure would be value added, or at least something closer to profit. To see why, consider two firms that both produce a product that sells for $1,000 per unit, and both firms sell 1,000 units per month. Both firms have sales of $1 million per month. Firm A buys the product wholesale at a cost of $800 per unit, then adds a mark-up. The value added of Firm A is $200,000 per month. Firm B buys raw materials of $200 per unit, adds labour of $300 per unit, and then sells the product. The value added of Firm B is $800,000 per month, because labour is a primary input rather than an intermediate one (and even after netting off labour costs, $500,000 per month remains). Firm B creates a lot more economic value than Firm A, and yet measured by sales they are the same. Sales are therefore a poor measure of productivity. Gross profit is less problematic, because it subtracts at least some intermediate input costs, but even gross profit is not a pure measure of value added or productivity.
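The arithmetic in the two-firm example can be checked with a short Python sketch. The function and variable names are made up for illustration, and the sketch makes one assumption explicit: value added is taken as sales minus intermediate inputs only, so Firm B's labour cost is reported separately. Netting off labour as well leaves $500,000 per month for Firm B. Either way, the two firms have identical sales but very different contributions.

```python
# Figures from the two-firm example; names are hypothetical.
UNITS = 1_000   # units sold per month
PRICE = 1_000   # selling price per unit ($)

def monthly_figures(intermediate_cost, labour_cost=0):
    """Return (sales, value_added, surplus_after_labour) in dollars per month.

    Assumption: value added = sales minus intermediate inputs, so labour
    is not subtracted until the third figure.
    """
    sales = PRICE * UNITS
    value_added = (PRICE - intermediate_cost) * UNITS
    surplus_after_labour = (PRICE - intermediate_cost - labour_cost) * UNITS
    return sales, value_added, surplus_after_labour

firm_a = monthly_figures(intermediate_cost=800)                   # buys wholesale
firm_b = monthly_figures(intermediate_cost=200, labour_cost=300)  # makes the product

print("Firm A:", firm_a)  # Firm A: (1000000, 200000, 200000)
print("Firm B:", firm_b)  # Firm B: (1000000, 800000, 500000)
```

Ranking the firms by the first figure says they are identical. Ranking them by either of the other two does not.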

As a measure of biodiversity, or more accurately as a set of proxies for biodiversity-related conditions and pressures, Griffin and Lubberink use a variety of indicators that they call 'biodiversity abundance markers' (which, for some reason, they abbreviate as BDAs). They aggregate data from a range of sources for their various BDAs (which I will discuss in further detail below), with the data at the SA2 level (SA2s are geographical areas approximately the size of suburbs in urban areas, and larger in rural or remote areas). They note that:

For each SA2, we define a vector of “biodiversity abundance markers” (BDAs), where each ranges from 0 to 100. We denote these ranks as BDA1, BDA2, … , BDAm. We then assign them to an SA2 and, therefore, to the businesses and employees in the same SA2. For a given BDA in an SA2, BDAm = 0 means complete biodiversity loss (high pressure from biodiversity loss) for marker m. BDAm = 100 (low pressure from biodiversity loss) is equivalent to an SA2 with an undisturbed or fully intact natural state.

So far, so good. The only issue with that approach is that the measures of biodiversity don't have a natural interpretation, because they are just an index. But we often work with indices - you just need to be cautious about how you interpret the magnitude of the effects. Griffin and Lubberink start by showing the correlation between each of their BDA measures and their measures of business output.

However, then they want to create an overall index of biodiversity, and to do this they:

...multiply each BDA by its SA2 land area and denote the result as an empirical proxy for the natural capital (n) of an SA2 applicable to the businesses operating therein.

Remember that the BDA is an index, bounded between 0 and 100, and it has no natural interpretation in terms of magnitude. So, multiplying the index by the land area of the SA2 is not meaningful, because the BDA is not a measured biodiversity stock per square kilometre. I guess it might make sense if you wanted to calculate a weighted average index, where the weights are based on SA2 land areas, but that isn't what Griffin and Lubberink are doing. Their approach is problematic because it mechanically causes the measured biodiversity to be higher in rural areas ceteris paribus (holding all else equal), where SA2s are larger, and lower in urban areas, where SA2s are smaller. Within urban areas, ceteris paribus it causes higher measured biodiversity in industrial and commercial areas, where SA2s are larger, and lower in residential areas, where SA2s are smaller.
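A toy calculation makes the mechanical effect concrete. The three SA2s and their land areas below are invented for illustration, and all three are given the same index value of 60, so any difference in the product comes purely from land area. The area-weighted average is included only for contrast, and is not something the paper computes.

```python
# Invented SA2s: (label, BDA index on a 0-100 scale, land area in km2).
sa2s = [
    ("urban residential", 60, 1.5),
    ("urban industrial", 60, 6.0),
    ("rural", 60, 400.0),
]

def index_times_area(index, area_km2):
    """The index-multiplied-by-land-area construction described in the post."""
    return index * area_km2

proxies = {label: index_times_area(bda, area) for label, bda, area in sa2s}

# For contrast: an area-weighted average stays on the original 0-100 scale.
total_area = sum(area for _, _, area in sa2s)
weighted_average = sum(bda * area for _, bda, area in sa2s) / total_area

for label, value in proxies.items():
    print(f"{label}: {value}")
print("area-weighted average index:", weighted_average)
```

With identical index values, the rural SA2 ends up with a proxy several thousand times larger than the residential one, while the weighted average is still 60.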

Griffin and Lubberink then aggregate their index-multiplied-by-land-area measures in various ways. The aggregation approach they adopt is fine, but when you aggregate numbers that are not individually meaningful, the result is not meaningful either.

But let's take a step back, because there is another problem. Griffin and Lubberink pitch their analysis as based on a Cobb-Douglas production function. That is fine - a Cobb-Douglas function is a way of relating inputs to output. We already know that their measure of output is faulty. Their inputs are also faulty. Their three-factor Cobb-Douglas function includes inputs of financial capital, human capital, and natural capital.

Griffin and Lubberink measure human capital as the number of employees working in business units in an SA2. That is really a measure of labour input, not human capital. To measure human capital (as well as labour), it would be better to also consider the education level of those employees, since more educated (not to mention more experienced) employees have more human capital. So, their measure is unlikely to pick up the important variation in human capital across SA2s, but it will pick up differences in labour input. But as a measure of combined labour and human capital, their measure will bias downwards measured human capital in urban areas, where education levels are highest, and bias upwards measured human capital in rural and remote areas, where education levels are lowest.

Griffin and Lubberink measure financial capital by the number of business units operating in an SA2. That is not financial capital. That is business density. The relationship between the number of firms and financial capital is not straightforward. An SA2 might have lots of small firms that have low aggregate financial capital, or one large firm that has a lot of financial capital.

Finally, we come back to natural capital, which is measured as noted above. However, some of the measures of biodiversity that Griffin and Lubberink use are better suited than others as a measure of natural capital. The definition of capital is important here - capital is stored up resources that can be used to produce things. Financial capital is stored up savings that can be used in the future. Human capital is stored up education and experience that can be used in the future. So, capital is a stock. It is not a flow.

Now, let's consider the BDA measures one by one. The first (BDA1 - Land Use) is "1 - the ratio of the number of agriculture and forestry business (primary industry) units in an SA2 to the total number of business units in an SA2". This is not really a measure of land use, because it isn't measured in terms of land. The relative size of the businesses is not taken into account, so many small farms would increase this measure compared to fewer large farms. It is also difficult to see how this is a measure of biodiversity.
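The business-size problem with a count-based ratio can be seen in a small sketch. The two SA2s below are hypothetical, with the same total farmed area and the same number of non-primary businesses. The formula follows the quoted definition of BDA1 and leaves out any rescaling to 0-100.

```python
def bda1(primary_units, total_units):
    """1 minus the share of business units in agriculture and forestry."""
    return 1 - primary_units / total_units

# Hypothetical SA2s, each with 1,000 hectares farmed and 30 other businesses.
many_small_farms = {"farms": 20, "hectares_per_farm": 50, "other_units": 30}
few_large_farms = {"farms": 2, "hectares_per_farm": 500, "other_units": 30}

for label, sa2 in [("many small farms", many_small_farms),
                   ("few large farms", few_large_farms)]:
    farmed = sa2["farms"] * sa2["hectares_per_farm"]
    score = bda1(sa2["farms"], sa2["farms"] + sa2["other_units"])
    print(f"{label}: {farmed} ha farmed, BDA1 = {score:.4f}")
```

Both SA2s have the same amount of land in primary production, but the count-based ratio gives them quite different scores.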

The second measure (BDA2 - Infrastructure) is "1 - the rank of the number of business units in an SA2 to the land area in km2 of an SA2 divided by the total number of SA2 observations". It is difficult to understand why this BDA is measured as a rank, whereas BDA1 was not. It is also difficult to see how the number of firms is a measure of infrastructure, or how it relates to biodiversity. This measure will tend to be lower in urban areas, where many small businesses are clustered, than in rural areas. So, this is likely just a measure of urbanicity, not a measure of infrastructure or biodiversity.
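To see what is lost by using a rank, here is a minimal sketch with made-up business densities. It assumes ranks run from 1 for the least dense SA2 to N for the most dense, and ignores ties. That is one reading of the quoted definition, not necessarily the exact convention used in the paper.

```python
def rank_score(values):
    """1 - rank/N, where rank 1 is the smallest value (assumes no ties)."""
    n = len(values)
    ordered = sorted(values)
    return [1 - (ordered.index(v) + 1) / n for v in values]

# Made-up business units per km2 for four SA2s, in two scenarios.
# Only the densest SA2 differs between them.
scenario_one = [0.5, 2.0, 40.0, 41.0]
scenario_two = [0.5, 2.0, 40.0, 3000.0]

print(rank_score(scenario_one))  # [0.75, 0.5, 0.25, 0.0]
print(rank_score(scenario_two))  # [0.75, 0.5, 0.25, 0.0]
```

The rank-based score cannot distinguish an SA2 that is slightly denser than its neighbour from one that is many times denser, which is the information the actual density would retain.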

The third measure (BDA3 - Mining) is "1 - ratio of the number of mining business units in an SA2 to the total number of business units in an SA2". Like BDA1, this doesn't account for the size of the mines. If you have a small quarry, that counts the same in this measure as the enormous Martha Mine in Waihi. It is more plausibly a measure of (negative) biodiversity than the other measures though. Or at least it would be, if the size of the businesses were taken into account.

The fourth measure is climate change in two forms (BDA4a - Climate Change, and BDA4b - Heat Spell Anomaly), which are measured as "the sum of the presence of a heat spell, cold spell, rain spell, or wind spell in an SA2 divided by 4" and "the rank of the heat spell anomalies in an SA2 divided by the total number of SA2 observations". They measure heat spells, cold spells, rain spells, and wind spells as the number of days on which the measured variable (temperature, rain, or wind) falls above (or below, for cold spells) the 'rolling mean 95th percentile' (it isn't clear what the term 'rolling mean 95th percentile' actually means). It isn't clear why adding those four up makes any sense, but perhaps you could just label them weather anomalies. In the second form of this measure, as with BDA2, it isn't clear why the rank is used when the actual number of heat spells could be used instead. Again, this isn't really a direct measure of biodiversity, but to the extent that weather anomalies impede biodiversity, it may be a reasonable proxy.

The fifth measure (BDA5 - River Diversity) is "River condition × 100, where River condition = Percentage of insect and related species in an SA-located river compared to all possible species". This is probably the clearest actual biodiversity measure in the paper. However, it is still a narrow one, because although it captures the presence of insect and related species in rivers, it doesn't capture biodiversity more generally. It also doesn't consider the abundance of species.

The sixth measure (BDA6 - Drinking Water) is "An indicator of the average improvement (higher BDA) or deterioration (lower BDA) in drinking water quality in a region based on periodic water testing". This measure is not a stock, it is a flow. It is a change over time, which gives no indication of the stock available for businesses to use in production. Since Griffin and Lubberink are interested in natural capital as a stock, it would have been better to use the level of drinking water quality, rather than the change in drinking water quality over time. This measure also has problems of reverse causality. Griffin and Lubberink use their measures as if they are business inputs. However, water quality is likely an output of business. Consider a dairy farm that reduces the water quality in a nearby stream. When this variable is included in the analysis as an input, Griffin and Lubberink have the causal relationship backwards.

The seventh measure (BDA7 - Plant Diseases) is "1 - percentage of plant diseases in an SA-unit compared to all possible plant diseases". Let's put aside the impossibility of measuring "all possible plant diseases". This might be a useful measure of (the lack of) biodiversity, but it would be better to directly measure plant biodiversity, rather than proxying for it by plant diseases.

The eighth measure (BDA8 - Matauranga) is "Percentage of SA2 population of Māori descent". This is a socio-cultural proxy for relationships with nature, not a measure of biodiversity.

The ninth measure (BDA9 - Population Density) is "1 - the rank of the population density in an SA2 divided by the total number of SA2 observations". Again, it isn't clear why the rank is used here, rather than actual population density. Also, like BDA2 this is a measure of urbanicity, not biodiversity.

The tenth measure (BDA10 - Possum Count) is "1 - the rank of the possum count in an SA2 divided by the total number of SA2 observations". Again, it isn't clear why the rank is used here, rather than some standardised measure of the actual possum count, or possums per land area. It is an indicator of biodiversity though, since more possums would typically mean fewer of other species.

Finally, the eleventh measure (BDA11 - Non-Drought Probability) is "1 minus the ratio of the number of drought weather events in an SA divided by the sum of the number of drought plus non-drought weather events in an SA2". It's not clear what a 'non-drought weather event' is, or why this is a sensible measure. This measure is probably correlated with the climate change measures in BDA4 in any case.

So, across the eleven (or twelve, if you treat the two BDA4 measures as separate) BDA measures, there are only three that are really measures of biodiversity, and there are a few that are likely to be meaningfully correlated with biodiversity. The issue is not that every variable must be a perfect direct measure of biodiversity. Empirical research often relies on proxy measures. The issue is that the interpretation should match the proxy. A variable that measures urbanicity, business density, ethnicity, or weather anomalies may be related to biodiversity, but it is not itself biodiversity. If those variables are then combined into a single measure of 'natural capital', the interpretation becomes difficult. The estimated relationship may reflect biodiversity, but it may also reflect a mix of urbanicity, industry mix, infrastructure, climate, or demographic composition. Conflating urbanicity with biodiversity is an especially clear problem for Griffin and Lubberink's analysis, given that they multiply their BDA measures by SA2 land area when constructing their overall measure of natural capital, as I noted earlier.

Finally, Griffin and Lubberink attempt to exploit what they describe as a quasi-natural experiment. The idea is that a number of government policy changes in 2016 and 2017 were intended to improve the environment. If these policies successfully increased biodiversity, then the relationship between biodiversity and business output should become stronger after those policies were implemented. However, this is not a particularly convincing identification strategy. The policies were national, so there is no obvious untreated control group within New Zealand. The test is essentially asking whether the relationship between natural capital and business output changed after 2016 or 2017. But many other things could also have changed around the same time, including macroeconomic conditions, industry conditions, investment decisions, business confidence, and local economic trends. Moreover, the policies themselves may have affected firms through channels other than biodiversity, not least through expectations about future policy changes. That makes it difficult to interpret any post-2016 or post-2017 change as evidence that biodiversity caused higher business productivity. This part of the analysis instead shows that the estimated association between natural capital and business output is not stable over time, and that might be due to policy changes or any number of other reasons.

There are other issues that I could pick out as well, such as not including SA2 fixed effects in their analysis (so that time-invariant differences between SA2s are not controlled for). To be fair, including SA2 fixed effects would absorb much of the cross-sectional variation in biodiversity that the authors are trying to use. But that is exactly the problem, because without SA2 fixed effects, the estimates may reflect other time-invariant differences between SA2s, and not differences in biodiversity.
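The fixed effects point can be illustrated with a toy panel. Everything below is invented: three SA2s observed over three years, where output depends only on a time-invariant SA2 trait and the true effect of the biodiversity index is zero by construction. The 'within' (demeaning) transformation is used as the standard way of applying fixed effects. None of this is the paper's data or specification.

```python
# Toy panel: output is driven by a fixed SA2 trait, not by biodiversity.
fixed_trait = {"A": 10, "B": 20, "C": 30}  # time-invariant, e.g. remoteness
year_shift = [-1, 0, 1]                     # within-SA2 change in the index
output_shock = [1, -2, 1]                   # unrelated to year_shift

sa2_ids, biodiversity, output = [], [], []
for sa2, trait in fixed_trait.items():
    for shift, shock in zip(year_shift, output_shock):
        sa2_ids.append(sa2)
        biodiversity.append(trait + shift)  # index is correlated with the trait
        output.append(5 * trait + shock)    # true biodiversity effect is zero

def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

def demean_by_group(values, groups):
    """Subtract each group's mean: the 'within' transformation."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]

pooled_slope = ols_slope(biodiversity, output)
within_slope = ols_slope(demean_by_group(biodiversity, sa2_ids),
                         demean_by_group(output, sa2_ids))

print(f"pooled slope (no fixed effects): {pooled_slope:.2f}")
print(f"within slope (SA2 fixed effects): {within_slope:.2f}")
```

In this constructed example the pooled regression reports a large positive slope even though biodiversity has no effect, because the index is picking up the fixed SA2 trait. The within estimate, which uses only changes inside each SA2, recovers the zero.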

The overall takeaway from this paper is not that correlation is not the same as causation. It is that if you want to demonstrate correlation, you first need to use the right data in the right way. Biodiversity might be good for business. Business might be good for biodiversity. This research doesn't convincingly estimate the relationship between biodiversity and business output.

Sex, Drugs and Economics

Wednesday, 3 June 2026
