I posted last month about consumers' willingness to pay for wine bullshit. That post was based on research by Kevin Capehart, published in the Journal of Wine Economics. Capehart now has another article in the same journal on a related topic (sorry, I don't see an ungated version online). In this new article, Capehart follows up on earlier research by Coco Krumme (described here). Essentially, Capehart uses a dataset of 120,000 wine descriptions, and classifies the wines as 'high price' (over US$50) or 'low price' (under US$15). He then trains a naive Bayes classifier to predict which wines belong to the low-price category and which belong to the high-price category, based on the words in their descriptions.
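To make the classification step concrete, here is a minimal sketch of a naive Bayes text classifier in plain Python. It is not Capehart's code: the whitespace tokenisation, the add-one (Laplace) smoothing, the function names `train` and `predict`, and the four toy descriptions are all assumptions made for illustration.

```python
import math
from collections import Counter

# Toy descriptions invented for illustration; not taken from Capehart's data.
TOY_DOCS = [
    ("elegant silky tannins with cassis", "high"),
    ("opulent structured finish with graphite", "high"),
    ("fruity and juicy easy drinking", "low"),
    ("tasty pasta wine with cherry", "low"),
]

def train(docs):
    """Count words per price category from (description, label) pairs."""
    word_counts = {}        # label -> Counter of word frequencies
    doc_counts = Counter()  # label -> number of descriptions
    vocab = set()
    for text, label in docs:
        words = text.lower().split()  # assumed tokenisation: split on whitespace
        word_counts.setdefault(label, Counter()).update(words)
        doc_counts[label] += 1
        vocab.update(words)
    return word_counts, doc_counts, vocab

def predict(text, model):
    """Return the label with the highest log posterior under naive Bayes."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    scores = {}
    for label in doc_counts:
        counts = word_counts[label]
        # Add-one smoothing: every vocabulary word gets one extra count.
        denom = sum(counts.values()) + len(vocab)
        score = math.log(doc_counts[label] / total_docs)  # log prior
        for word in text.lower().split():
            if word in vocab:  # ignore words never seen in training
                score += math.log((counts[word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)

model = train(TOY_DOCS)
print(predict("silky structured tannins", model))
print(predict("juicy easy pasta", model))
```

With real data, the interesting output is less the predictions themselves than which words push a description most strongly towards one price category or the other.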
Capehart largely reproduces the results of the earlier Krumme work, but perhaps more interestingly:
...I find for the dataset studied here that there do seem to be two mostly distinct vocabularies for high- and low-priced wines. Out of the roughly 20,000 unique words used to describe the over-$50 and/or under-$15 wines, only 42% of those words overlap by being used to describe wines in both price categories. The remaining 58% of words are non-overlapping.
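The overlap figure in that quote is a simple set calculation: the share of all unique words that appear in both price categories. A rough sketch follows, again assuming whitespace tokenisation and using made-up descriptions; the function name `vocabulary_overlap` is hypothetical, and Capehart's exact text preprocessing may differ.

```python
def vocabulary_overlap(high_texts, low_texts):
    """Share of all unique words that appear in both price categories."""
    high_vocab = {w for text in high_texts for w in text.lower().split()}
    low_vocab = {w for text in low_texts for w in text.lower().split()}
    all_words = high_vocab | low_vocab  # every unique word in either category
    shared = high_vocab & low_vocab     # words used for both categories
    return len(shared) / len(all_words) if all_words else 0.0

# Invented examples: only "with" is shared among 7 unique words.
share = vocabulary_overlap(
    ["silky tannins with cassis"],
    ["juicy cherry with spice"],
)
print(f"{share:.0%} of unique words overlap")
```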
In other words, the descriptions of high-priced wines use very different words than the descriptions of low-priced wines. You might argue that because high-priced wines have different characteristics than low-priced wines, their descriptions should draw on different vocabularies. However, the key question that isn't answered (and which Capehart alludes to in his conclusion) is: does the vocabulary relate to the quality of the wine, or is it simply a signal of the price? That is, do those who write the descriptions choose their vocabulary based on the price of the wine, or based on its quality? We'd need more research to answer that question, including more research like Capehart's earlier work.