One of the first rules of working with real-world data is to graph it. That allows us to see where the data has weird inconsistencies, such as those described in this blog post by Kathleen Beegle over on the Development Impact blog:
Has the share of women in the labor force in Rwanda fallen from 84% in 2014 to 52% in 2019? I highly doubt it. More likely, we are seeing the consequence of a major change in the internationally agreed-upon statistical definition of employment. Yes, really. It is quite likely to be a change of which mainly statistical-type-super-data-nerd economists would be aware. But it is one that all of us might want or need to be aware of, especially if you want to properly use country statistics and/or benchmark your own surveys with estimates from national statistical agencies.
In 2013, the 19th International Conference of Labour Statisticians (ICLS) redefined several key labor statistics. It’s taken a few years for these new definitions/concepts to get integrated into questionnaires, into survey efforts, and, as I suspect above, to show up in country statistics. The ICLS19 made several changes but here I focus on one specific change: employment is now work for pay or profit. But wasn’t that was it was before? Not quite. A key feature to this change is that work that is mainly intended for “own-use production” is now excluded, where it was counted as employed before. Before ICLS19, production of primary products, whether for market or household consumption, was counted as employment. The new definition means that someone farming mainly for family consumption (i.e. subsistence farmers) is no longer “employed”. (Though they could still be employed by the new definition if they have another job that qualifies, and in the labor force if they being available and actively searching for work in the form of pay or profit). So subsistence farmers (or those otherwise growing crops mainly intended for home consumption) are “working”, but not “employed”. I put quotes to emphasize these specific terms.
Beegle identifies an interesting phenomenon, and one that we should be careful about: statistical agencies changing the definition of variables that we use. Our analyses would be confounded by these definitional changes if we don't account for them in some way. Now, if all statistical agencies had applied the new definition at the same time, that would be easy to handle, but it appears they haven't. Looking at the World Development Indicators, here's the data for five countries: (1) Rwanda; (2) Niger; (3) Papua New Guinea; (4) Benin; and (5) Cameroon (original data are here).
I deliberately chose these five countries as they all seem to experience a similar transition from a high steady-state labour force participation rate to a lower steady-state labour force participation rate. The problem is that all five countries make this transition at different times. So, the usual ways that we would deal with a break in the time series (such as by using a dummy variable for before/after the change in definition, or a before/after dummy variable interacted with a dummy variable for each country) simply aren't going to work well.
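To make the problem with a single before/after dummy concrete, here is a minimal sketch in Python. The country labels and switch years are invented purely for illustration (they are not read from the World Development Indicators), and the function name `misclassified` is my own. It counts how many country-years a common cutoff year would assign to the wrong definition when each country switches at a different time.

```python
# Hypothetical switch years, invented for illustration only;
# they are NOT taken from the WDI or ILOSTAT data.
break_year = {"A": 2005, "B": 2011, "C": 2016}
years = range(2000, 2020)


def misclassified(cutoff):
    """Count country-years where a common 'year >= cutoff' dummy
    disagrees with the country's own (assumed) definition regime."""
    count = 0
    for country, switch in break_year.items():
        for year in years:
            common_dummy = year >= cutoff   # one dummy for all countries
            own_regime = year >= switch     # what actually applies (assumed)
            if common_dummy != own_regime:
                count += 1
    return count


# Even the best possible common cutoff mislabels some country-years.
best_cutoff = min(years, key=misclassified)
print(best_cutoff, misclassified(best_cutoff))
```

With these made-up switch years, no common cutoff gets every country-year right, which is the sense in which a single before/after dummy falls short when the definitional change arrives at different times.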
This graph also highlights something else, which is no doubt a feature of the underlying ILOSTAT data. Each transition is far too smooth. It is as if a straight line had been drawn from the end of the initial steady-state series to the start of the new series. No doubt this period of linear transition is actually masking missing data between two consecutive labour force surveys.
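A rough way to flag this kind of straight-line transition without relying on the eye alone is to look for stretches where the year-on-year change is exactly constant and non-zero. The sketch below is only illustrative: the series is made up to mimic the pattern described above, and the function name `linear_runs` and its `min_len` and `tol` parameters are my own choices, not anything from ILOSTAT or the WDI.

```python
def linear_runs(values, min_len=3, tol=1e-6):
    """Return (start, end) index pairs for stretches where at least
    `min_len` consecutive changes are equal and non-zero, i.e. the
    points lie on a single sloped straight line."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    runs = []
    i = 0
    while i < len(diffs):
        j = i
        # Extend the run while the next change equals the first one.
        while j + 1 < len(diffs) and abs(diffs[j + 1] - diffs[i]) <= tol:
            j += 1
        # Keep only sloped runs; flat stretches are not flagged.
        if j - i + 1 >= min_len and abs(diffs[i]) > tol:
            runs.append((i, j + 1))
        i = j + 1
    return runs


# Made-up participation rates (%): noisy, then a perfectly linear
# decline, then noisy again. Not real data.
series = [84, 85, 84, 84, 76, 68, 60, 52, 53, 52]
print(linear_runs(series))
```

Any stretch flagged this way is a candidate for interpolated rather than observed values, although a constant change could in principle also occur in genuine data, so it is a prompt to check the survey years rather than proof.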
We would need to take great care when using the annual data, because considerable measurement error is introduced both by the change in definition and by the way the data series have been smoothed between each labour force survey. This is a timely reminder of why the first step in any data analysis should be to graph the data, to help us understand what we are working with.