Tuesday 18 April 2023

Doing qualitative research at scale?

Qualitative research usually involves much smaller sample sizes than quantitative research. That is largely a necessary consequence of how the two kinds of research are conducted. In quantitative research, a dataset that is ten times larger does not take ten times as long to analyse. In contrast, qualitative research takes much longer as the number of research participants grows. In other words, quantitative research scales much more easily than qualitative research does.

The main constraint that makes qualitative research less scalable than quantitative research is researcher time and effort. In qualitative research, every interview, focus group, participant observation, or whatever the unit of analysis happens to be, must be analysed individually by the researcher. Unlike in quantitative research, this process cannot easily be automated. Sure, there are tools to help with coding qualitative data, but they help with the management of the process more than with the analysis itself. On top of that, different researchers may code the data in slightly different ways, meaning that it is not easy to scale up qualitative research simply by adding team members. The larger the qualitative research team, the larger the discrepancies between different coders are likely to be.

But, what if there was a way to easily scale qualitative research, allowing a smaller number of researchers (perhaps even one) to analyse a larger number of observations? Wouldn't that be great? That is the premise behind this paper, discussed in a relatively accessible way in this blog post on the Development Impact blog by Julian Ashwin and Vijayendra Rao (two of the seven co-authors of the paper). Specifically, they:

...develop a “supervised” [Natural Language Processing] method that allows open-ended interviews, and other forms of text, to be analyzed using interpretative human coding. As supervised methods require documents to be “labelled”, we use interpretative human coding to generate these labels, thus following the logic of traditional qualitative analysis as closely as possible. Briefly, a sub-sample of the transcripts of open-ended interviews are coded by a small team of trained coders who read the transcripts, decide on a “coding-tree,” and then code the transcripts using qualitative analysis software which is designed for this purpose. This human coded sub-sample is then used as a training set to predict the codes on the full, statistically representative sample. The annotated data on the “enhanced” sample is then analyzed using standard statistical analysis, correcting for the additional noise introduced by the predictions. Our method allows social scientists to analyze representative samples of open-ended qualitative interviews, and to do so by inductively creating a coding structure that emerges from a close, human reading of a sub-sample of interviews that are then used to predict codes on the larger sample. We see this as an organic extension of traditional, interpretative, human-coded qualitative analysis, but done at scale.
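The core of the pipeline the authors describe, fitting a model on the human-coded sub-sample and then predicting codes on the remaining transcripts, can be sketched in a few lines. This is a minimal illustration using scikit-learn, with made-up transcript snippets and two invented codes; the paper's actual models, coding tree, and noise correction are considerably more elaborate.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical human-coded sub-sample: each transcript excerpt carries
# a label from the coding tree (here just two illustrative codes).
coded_texts = [
    "I hope my daughter finishes school and becomes a doctor",
    "We want our son to study at university",
    "There is no work here and we struggle to find food",
    "Life in the camp is hard and jobs are scarce",
]
coded_labels = ["aspiration", "aspiration", "hardship", "hardship"]

# Fit a simple text classifier on the coded sub-sample...
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(coded_texts, coded_labels)

# ...then predict codes on the rest of the (uncoded) sample,
# which can afterwards be analysed with standard statistical methods.
uncoded_texts = [
    "My children should get a good education",
    "We cannot find enough work to eat",
]
predicted_codes = model.predict(uncoded_texts)
```

The key point is that the labels come from interpretative human coding, not from the machine; the model only extends those human-generated codes to the full, representative sample.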

Natural Language Processing (NLP) is one of the cool new toys in the quantitative researcher's toolkit. It allows the analysis of "text as data" (see this paper in the Journal of Economic Literature, or this ungated earlier version, for a review). NLP models have been used in a wide range of applications, such as evaluating the effect of media sentiment on the stock market, or using web search data to estimate corruption in US cities.

Anyway, back to the paper at hand. Ashwin and Rao report that in their paper they:

...apply this method to study parents’ aspirations for their children by analyzing data from open-ended interviews conducted on a sample of approximately 2,200 Rohingya refugees and their Bangladeshi hosts in Cox’s Bazaar, Bangladesh.

The actual application itself is less important here than the method, about which they conclude that:

This illustrates the key advantage of our method – we are able to use the nuanced and detailed codes that emerge from interpretative qualitative analysis, but at a sample size that allows for statistical inference.

Now, I don't doubt that this is a very efficient way of analysing a lot of qualitative data, without the need for a huge amount of researcher time. However, I imagine that many qualitative researchers would argue that the method that this paper employs is not qualitative research at all. It simply takes qualitative data, constructs quantitative measures from it, and then applies quantitative analysis methods to it. This is a trap that many quantitative researchers fall into when faced with open-ended survey responses. Indeed, it is a trap that I have fallen into myself in the past, so now I partner with researchers with skills in applying qualitative methods when undertaking research that involves both qualitative and quantitative analyses (see here and here, for example). However, Ashwin and Rao are not oblivious to this problem:

Unsupervised NLP analysis provides too coarse of a decomposition of the text, which may not be suited to many research questions, as we show by comparing our results to those using a Structural Topic Model. This topic model shows that there are clearly differences in the language used by, for instance, hosts and refugees. However, interpreting these differences in terms of aspirations, ambition and navigational capacity is difficult. Unsupervised methods can thus uncover interesting dimensions of variation in text data, but they will often not give interpretable answers to specific research questions.
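To see what the authors mean by a "coarse" decomposition, here is a sketch of the unsupervised alternative. Note the hedge: the paper compares against a Structural Topic Model, which scikit-learn does not implement, so this uses plain LDA as a stand-in, on a toy corpus invented for illustration.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy corpus standing in for interview transcripts.
docs = [
    "my daughter should finish school and go to university",
    "we hope our son studies hard and becomes a doctor",
    "there is no work in the camp and food is scarce",
    "finding a job is difficult and wages are very low",
]

# Bag-of-words counts, then a two-topic decomposition.
counts = CountVectorizer(stop_words="english").fit_transform(docs)
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)

# Each row of doc_topics is a document's topic mixture. The topics
# themselves are unlabelled word distributions: deciding whether a
# topic corresponds to "aspirations" or "navigational capacity" is
# left entirely to the researcher, which is the interpretability gap
# the authors point to.
```

The unsupervised model will happily separate documents that use different vocabularies, but it cannot, by itself, map that variation onto the specific concepts a research question asks about.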

So, indeed there is still a role for qualitative researchers. At least, until ChatGPT takes over, at which time qualitative research may truly scale.
