Paola Giuliano (UCLA) and Nathan Nunn (Harvard) have a new dataset that, as far as I can see, has enormous potential for looking at a wide range of questions in development, as well as providing a host of candidate variables for use as instruments in otherwise-unrelated analyses. The development of the dataset is described in an article published earlier this year in the journal Economic History of the Developing Regions (ungated version here). The dataset itself is available from Nathan Nunn's website here.
The journal article by Giuliano and Nunn explains:
We contribute to this line of research by providing a publicly accessible database that measures the economic, cultural, political, and environmental characteristics of the ancestors of current population groups... Specifically, we construct measures of the average pre-industrial characteristics of the ancestors of the populations in each country of the world. The database is constructed by combining preindustrial ethnographic information for approximately 1,300 ethnic groups with information on the current distribution of approximately 7,500 language groups measured at the grid-cell level.
Giuliano and Nunn then go on to describe the dataset, as well as providing illustrations of the data. What particularly caught my eye was a brief analysis they did of the relationship between their historical geographic characteristics (meaning the average ancestral characteristics of populations living in current countries) and current GDP. They find that:
Not surprisingly, being further from the equator is positively associated with real per capita GDP. However, what is more surprising is that the ancestral measure appears to be much more strongly correlated than the contemporary measure. This is particularly striking since we would expect the ancestral measure to be more imprecisely measured than the contemporary measure.
They find similar results for ancestral ruggedness of the land, and ancestral distance from the coastline. These results caught my eye because they suggest to me that these variables might be suitable instruments for GDP in other analyses (such as when GDP would be endogenous in the particular model you are trying to run). If that was a bit too pointy-headed for you, don't worry. It just suggests that these variables have a lot of potentially cool uses for economists.
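For anyone wondering what using one of these variables as an instrument would look like in practice, here is a minimal sketch of two-stage least squares in Python. Everything in it is an assumption for illustration: the data are synthetic (not drawn from the Giuliano and Nunn dataset), the variable names such as `ancestral_latitude` are hypothetical, and the effect sizes are made up. It also assumes the instrument affects the outcome only through GDP, which would need to be argued carefully for any real application.

```python
import numpy as np


def ols_slope(y, x):
    """Slope from a simple OLS regression of y on x (with an intercept)."""
    X = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]


def two_stage_least_squares(y, x, z):
    """Slope on x from 2SLS, using z as the instrument for x."""
    Z = np.column_stack([np.ones(len(z)), z])
    # First stage: predict the endogenous regressor from the instrument.
    first_stage_coef, *_ = np.linalg.lstsq(Z, x, rcond=None)
    x_hat = Z @ first_stage_coef
    # Second stage: regress the outcome on the predicted regressor.
    return ols_slope(y, x_hat)


# Synthetic data only; none of these numbers come from the real dataset.
rng = np.random.default_rng(0)
n = 5000
ancestral_latitude = rng.normal(size=n)  # hypothetical instrument
confounder = rng.normal(size=n)          # unobserved; drives GDP and outcome
log_gdp = 0.8 * ancestral_latitude + confounder + rng.normal(size=n)
true_effect = 0.5                        # assumed effect of GDP on the outcome
outcome = true_effect * log_gdp + confounder + rng.normal(size=n)

ols_estimate = ols_slope(outcome, log_gdp)
iv_estimate = two_stage_least_squares(outcome, log_gdp, ancestral_latitude)

print(f"OLS estimate: {ols_estimate:.3f}")
print(f"IV estimate:  {iv_estimate:.3f} (assumed true effect: {true_effect})")
```

In this made-up setup the plain OLS estimate is biased upwards because the unobserved confounder moves GDP and the outcome together, while the instrumented estimate should land close to the assumed true effect. The standard errors from a hand-rolled second stage like this one would not be correct, so a dedicated IV routine is the better choice for real work.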
[HT: Marginal Revolution]