I read an interesting paper today on the topic, by Philip Hersch and Jodi Pelkowski (both Wichita State University) and published in Applied Economics Letters (sorry, I don't see an ungated version anywhere online). In the paper the authors look at voting data from three referendums on fluoridation - two referendums in Wichita, Kansas (in 1978 and 2012) and one in Portland, Oregon (in 2013) - and attempt to determine "the factors driving voter demand for and against fluoridation". That would be great if the authors had access to data on individual voters, but they don't. Instead they use data aggregated to voting precincts, and this makes their interpretation of results susceptible to the ecological fallacy.
The ecological fallacy arises when inferences about individuals are drawn from an analysis of groups to which those individuals belong, or an analysis of geographical areas in which those individuals live. There are a number of papers which explain the fallacy (e.g. try this one (gated), or this one (ungated) for a more mathematical treatment). However, it is simplest to explain with an example.
One of the most widely cited examples of the ecological fallacy is drawn from the work of Emile Durkheim on suicide in 19th Century Prussia. Durkheim found that suicide rates were higher in provinces that had a higher proportion of Protestants in the population (and a lower proportion of Catholics). If we infer from this result that Protestants were at higher risk of committing suicide, we would be committing the ecological fallacy (by inferring that the group-level analysis tells us something about the individuals belonging to each group). It may instead be that Catholics living in provinces where they were surrounded by many Protestants were more likely to commit suicide. The group-level (province-level) analysis cannot distinguish between these two possibilities (or indeed, a number of other possible explanations).
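To make that concrete, here is a small sketch in Python. Every number in it is made up for illustration (the four provinces, the Protestant shares and both risk functions are my assumptions, not Durkheim's data). It builds a world in which Protestants have the same suicide risk everywhere and Catholics' risk rises with the share of Protestants around them, and then looks only at the province-level rates, as a group-level analysis would.

```python
from math import sqrt

# Hypothetical Protestant population shares for four made-up provinces.
protestant_shares = [0.1, 0.2, 0.3, 0.4]

PROTESTANT_RISK = 0.001  # assumed: identical in every province


def catholic_risk(protestant_share):
    # Assumed: Catholics' risk rises with the share of Protestants around them.
    return 0.001 + 0.004 * protestant_share


def province_rate(protestant_share):
    # Overall rate is the population-weighted average of the two groups' risks.
    return (protestant_share * PROTESTANT_RISK
            + (1 - protestant_share) * catholic_risk(protestant_share))


def pearson(xs, ys):
    # Plain Pearson correlation coefficient between two equal-length lists.
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rates = [province_rate(p) for p in protestant_shares]
group_level_r = pearson(protestant_shares, rates)

print(f"province-level correlation: {group_level_r:.2f}")
for p, rate in zip(protestant_shares, rates):
    print(f"share={p:.1f} rate={rate:.5f} "
          f"protestant={PROTESTANT_RISK:.4f} catholic={catholic_risk(p):.4f}")
```

In this toy world the province-level correlation between Protestant share and the suicide rate is strongly positive, even though no Protestant is at higher risk than any Catholic in the same province. Someone who saw only the province-level rates could not tell this world apart from one in which Protestants really were the group at higher risk.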
Coming back to the fluoridation voting paper, because the analysis was conducted at the voting precinct level, nothing can be inferred about individual voters' preferences based on the precinct-level analysis. Hersch and Pelkowski are generally pretty careful about this, but not always. For example, they say:
"%College is strongly positive in all regressions, indicating that educated voters are more likely to subscribe to the mainstream science."
No, this is the ecological fallacy. What they should have said was: "%College is strongly positive in all regressions, indicating that voters who live in voting precincts that have a more educated population are more likely to subscribe to the mainstream science". Notice the difference?
Similarly, where Hersch and Pelkowski conclude:
"Fluoride support is weaker among residents known to have been raised in a nonfluoridated environment, than for residents who were born elsewhere."
They should instead have said: "Fluoride support is weaker in voting precincts where a greater proportion of the population is known to have been raised in a nonfluoridated environment, than in voting precincts where a greater proportion of residents were born elsewhere".
The takeaway message here is that, as researchers, we need to be careful when we do analyses using geographical-level averages (as in my work on alcohol outlet density with Bill Cochrane, Michael Livingston and others), so that we do not fallaciously draw inferences at the individual level from our results.
Finally, the results from the Hersch and Pelkowski article are interesting. Sticking with results that are common across all three referendums, voting precincts that have a higher proportion of people with a college education, and those that have a lower proportion of the population born in that state, are more likely to vote for fluoridation. The results for political affiliation were more mixed, with some suggestive results that voting precincts with a higher proportion of libertarian and conservative voters (in Wichita 1978) and precincts with a higher proportion of Green Party voters (in Portland 2013) had lower proportions of voters in favour of fluoridation. Of course, we can't directly infer anything from this last result about the voting preferences of libertarians or Green Party supporters. We would need individual voter-level data to do so.