Friday, 4 April 2025

This week in research #69

Here's what caught my eye in research over the past week:

  • Altindag, Cole, and Filiz (with ungated earlier version here) find that students' academic performance is better when their race matches their teacher's, but that this is only true for students who are younger than their teacher, and not for students who are a similar age to, or older than, their teacher (role models clearly matter)
  • Calamunci and Lonsky (open access) find that, between 1960 and 1993, an Interstate highway opening in a county led to an 8% rise in total index crime, driven by property crime (burglary, larceny, and motor vehicle theft)
  • Achard et al. (open access) find that individuals living close to newly installed refugee facilities in the Netherlands developed a more positive attitude towards ethnic minorities and became less supportive of anti-immigration parties compared to individuals living farther away

Thursday, 3 April 2025

Mobile phone providers and the repeated switching costs game

This week, my ECONS101 class covered pricing and business strategy, and one aspect of that is switching costs and customer lock-in. Switching costs are the costs of switching from one good or service to another (or from one provider to another). Customer lock-in occurs when customers find it difficult (costly) to change once they have started purchasing a particular good or service. The main cause of customer lock-in is, unsurprisingly, high switching costs.

As one example, consider this article from the New Zealand Herald last month:

A new Commerce Commission study has found the switching process between telecommunications providers is not working as well as it should for consumers...

The study found 50% of mobile switchers and 45% of broadband switchers ran into at least one issue when switching.

The experience was so bad that 29% of mobile switchers and 27% of broadband switchers said they wouldn’t want to switch again in future...

The commission’s latest consumer satisfaction report found that 31% of mobile consumers and 29% of broadband consumers have not switched because it requires ‘too much effort to change providers’...

Gilbertson said a lack of comprehensive protocols between the “gaining” service provider and the “losing” service provider was a central issue with the current switching process.

This led to a number of problems, including double billing, unexpected charges, and delays.

The difficulty of changing from one mobile phone provider to another is a form of switching cost. It's not a monetary cost, but the time, effort, and frustration experienced by consumers wanting to switch makes the process of switching costly. And because the process is costly, mobile phone consumers are locked into their current provider.

It is clear why a mobile phone provider would want to make it difficult (costly) for its consumers to switch away from it and use some other provider. However, why don't mobile phone providers try to make it easier to switch to using their service instead? Maybe they could have staff whose role is to help consumers to navigate the process of switching to their service. That would allow the mobile phone provider to attract consumers and capture a greater market share. The answer is provided by considering a little bit of game theory.

Consider the game below, with two mobile phone providers (A and B), each with two strategies ('Easy' to switch to, and 'Hard' to switch to). The payoffs are made-up numbers that might represent profits to the two providers.

To find the Nash equilibrium in this game, we use the 'best response method'. To do this, we track, for each strategy that one player might choose, the best response of the other player. Where both players are selecting a best response, they are doing the best they can, given the choice of the other player (this is the definition of Nash equilibrium). In this game, the best responses are:

  1. If Provider B chooses to make switching easy, Provider A's best response is to make switching easy (since 3 is a better payoff than 2) [we track the best responses with ticks, and not-best-responses with crosses; Note: I'm also tracking which payoffs I am comparing with numbers corresponding to the numbers in this list];
  2. If Provider B chooses to make switching hard, Provider A's best response is to make switching easy (since 8 is a better payoff than 6);
  3. If Provider A chooses to make switching easy, Provider B's best response is to make switching easy (since 3 is a better payoff than 2); and
  4. If Provider A chooses to make switching hard, Provider B's best response is to make switching easy (since 8 is a better payoff than 6).

Note that Provider A's best response is always to choose to make switching easy. This is their dominant strategy. Likewise, Provider B's best response is always to make switching easy, which makes it their dominant strategy as well. The single Nash equilibrium occurs where both players are playing a best response (where there are two ticks), which is where both providers make switching easy.

So, that seems to suggest that the mobile phone providers should be making switching to them easier. However, notice that both providers would be unambiguously better off if they both chose to make switching hard (they would each receive a payoff of 6, instead of 3). When both providers choose to make switching easy, both are made worse off. This is a prisoners' dilemma game (it's a dilemma because, when both players act in their own best interests, both are made worse off).
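The best-response reasoning above can be sketched in a few lines of code. The payoffs are the made-up numbers from the game (hypothetical profits, purely for illustration):

```python
from itertools import product

strategies = ["Easy", "Hard"]

# Made-up payoffs from the game in the text: payoffs[(a, b)] = (A's profit, B's profit)
payoffs = {
    ("Easy", "Easy"): (3, 3),
    ("Easy", "Hard"): (8, 2),
    ("Hard", "Easy"): (2, 8),
    ("Hard", "Hard"): (6, 6),
}

def best_response_A(b):
    """A's best response to B playing strategy b."""
    return max(strategies, key=lambda a: payoffs[(a, b)][0])

def best_response_B(a):
    """B's best response to A playing strategy a."""
    return max(strategies, key=lambda b: payoffs[(a, b)][1])

# Nash equilibria: strategy profiles where both players play a best response
nash = [(a, b) for a, b in product(strategies, strategies)
        if a == best_response_A(b) and b == best_response_B(a)]
print(nash)  # [('Easy', 'Easy')]

# The dilemma: both would be better off at (Hard, Hard) than at the equilibrium
print(payoffs[("Hard", "Hard")], ">", payoffs[("Easy", "Easy")])
```

Note that 'Easy' comes back as the best response regardless of what the other provider does (the dominant strategy), so ('Easy', 'Easy') is the unique Nash equilibrium, even though ('Hard', 'Hard') pays both providers more.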

That's not the end of this story though, because the simple example above assumes that this is a non-repeated game. A non-repeated game is played once only, after which the two players go their separate ways, never to interact again. Most games in the real world are not like that - they are repeated games. In a repeated game, the outcome may differ from the equilibrium of the non-repeated game, because the players can learn to work together to obtain the best outcome.

So, given that this is a repeated game (because the providers are constantly deciding whether to make switching easier or not), both providers will realise that they are better off making switching harder, and receiving a higher payoff as a result. And unsurprisingly, that is what happens, and it doesn't require an explicit agreement between the players - the agreement is 'tacit' (it is understood by the providers without needing to be made explicit). Each provider just needs to trust that the other providers will make switching hard (because there is an incentive for each provider to 'cheat' on this outcome). Any instance of cheating (by making switching easier) would be immediately known by the other providers, and the agreement would break down, making them all worse off. So, there is an incentive for all providers to keep making switching hard for consumers. Even a new entrant into the market, which might initially make it easy for consumers to switch to it in order to capture market share, would soon realise that it is better off making switching more difficult (it is not so long ago (2009) that 2degrees was a new entrant in this market).

The Commerce Commission is correct that the difficulty of switching mobile phone providers (the switching cost) keeps consumers with their current provider (customer lock-in). The result is that the mobile phone providers can profit from increasing prices for their locked-in consumers. The only solution would be to find some way to force a breakdown of the tacit arrangement. Then the market would settle at the equilibrium where all providers make it easy to switch to them. This may be an instance where some regulation is necessary.

Tuesday, 1 April 2025

The emerging debate on Oprea's paper on complexity and Prospect Theory

Late last year, an article in the American Economic Review by Ryan Oprea caught my attention (and I blogged about it here). It purported to show that the key experimental results underlying Prospect Theory may in part be driven by the complexity of the experiments that are used to test them. These were extraordinary results. And when you publish a paper with extraordinary results, that could potentially overturn a large literature on a particular theory, then those results are going to attract substantial scrutiny. And indeed, that is what has happened with Oprea's paper.

The team at DataColada, best known for exposing the data fakery of Dan Ariely and Francesca Gino (and the resulting lawsuit, which was dismissed), have a new working paper, authored by Daniel Banki (ESADE Business School) and co-authors, looking at Oprea's results (see also the blog post on DataColada by Uri Simonsohn, one of the co-authors). To be clear, before I discuss Banki et al.'s critique: they don't accuse Oprea of any misconduct. They mostly present an alternative view of the data and results that appears to contradict key conclusions in Oprea's paper. Oprea has also provided a response to some of their critique.

I'm not going to summarise Oprea's original paper in detail, as you can read my comments on it here. However, the key result in the paper is that when presented with risky choices, research participants' behaviour was consistent with Prospect Theory, and when presented with choices that involved no risk at all but were complex in a similar way to the risky choices ('deterministic mirrors'), research participants' behaviour was also consistent with Prospect Theory. This suggests that a large part of the observed results that underlie Prospect Theory may arise because of the complexity of the choice tasks that research participants are presented with.

Banki et al. look at a number of 'comprehension questions' that Oprea presented research participants with, and note that:

...75% of participants made an error on at least one of the comprehension questions, such as erroneously indicating that the riskless mirror had risk.

Once the data from those research participants is excluded, Banki et al. show that research participant behaviour differs between lotteries and mirrors for the research participants who 'passed' the comprehension checks (by getting all four of the comprehension questions correct on their first try). This is captured in Figure 2 from Banki et al.'s paper:

The two panels on the left of Figure 2 show the results for the full sample, and notice that lotteries (top panel) and mirrors (bottom panel) look similar in terms of results. In contrast, when the sample is restricted to those who 'passed' the comprehension checks, the results for lotteries and mirrors look very different. That is what we would expect if research participants are not 'fooled' by the complexity of the task.

Banki et al. provide a compelling reason why the results for the research participants who failed the comprehension checks look the same for lotteries and mirrors: regression to the mean. As Simonsohn explains in the DataColada blog post, this arises because of the way that a multiple-price list works:

When the dependent variable is how much people value prospects, regression to the mean creates spurious evidence in line with prospect theory. When people answer randomly for 10% chance of $25, they overvalue it, because the “right” valuation is $2.50, and the scale mostly contains values that are higher than that. When people answer randomly for 90% chance of $25, they undervalue it, because the “right” valuation is $22.50 and the scale mostly contains values that are lower than that. Thus, random or careless responding will produce the same pattern predicted by prospect theory.
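Simonsohn's point can be illustrated with a small simulation. Here I assume a stylised multiple-price list offering certain amounts from $0 to $25 in $1 steps (an illustrative assumption; the actual list in the experiment may differ), and model a 'careless' participant as picking a switching point at random:

```python
import random

random.seed(0)

# Stylised multiple-price list: certain amounts from $0 to $25 in $1 steps
scale = list(range(26))

def random_valuation():
    """A careless participant picks a switching point at random on the list."""
    return random.choice(scale)

# Average valuation produced by purely random responding
n = 10_000
avg = sum(random_valuation() for _ in range(n)) / n  # close to 12.5, the scale midpoint

# Expected values of the two prospects
ev_10 = 0.10 * 25   # $2.50 for a 10% chance of $25
ev_90 = 0.90 * 25   # $22.50 for a 90% chance of $25

print(avg > ev_10)  # True: random responders 'overvalue' the 10% prospect
print(avg < ev_90)  # True: random responders 'undervalue' the 90% prospect
```

So random responding alone generates overvaluation of low-probability prospects and undervaluation of high-probability prospects, which is exactly the pattern Prospect Theory predicts.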

Oprea responds to both of these points, noting that:

...a range of imperfectly rational behaviors including noisy valuations, anchoring-and-adjustment heuristics, compromise heuristics and pull-to-the-center heuristics will all tend to produce prospect-theoretic patterns of behavior simply because of the nature of valuation. BSWW offer this possibility as an alternative to the Oprea (2024)’s account of his data, but in fact these are examples of exactly the types of cognitive shortcuts Oprea (2024) was designed to study.

In other words, Banki et al.'s results don't refute Oprea's results, but are very much in line with Oprea's. One thing that Oprea does take issue with is Banki et al.'s use of medians as the preferred measure of central tendency. Oprea uses the mean, and when reanalysing the data with the same exclusions as Banki et al., Oprea shows that the mean results look similar to the original paper. So, Banki et al.'s results are not simply driven by excluding the research participants who failed the comprehension checks, but also by switching from using the mean to using the median.

On that point, I'm inclined to agree with Banki et al. The median is often used in experimental economics, because it is less influenced by outliers. And if you look at Oprea's data, there are a lot of large outliers, which become quite influential observations when the mean is used as the summary statistic. However, the outliers are likely to be the observations you want to have the smallest effect on your results, not the largest effect.
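A quick illustration of why the choice of summary statistic matters, using made-up valuations (not Oprea's actual data):

```python
import statistics

# Stylised valuations: most participants cluster near $5, plus two large outliers
valuations = [4, 5, 5, 5, 6, 6, 7, 25, 25]

mean = statistics.mean(valuations)      # pulled up towards the outliers
median = statistics.median(valuations)  # unaffected by the outliers
print(mean, median)                     # 9.777... vs 6
```

The two outliers drag the mean well above where most of the distribution sits, while the median stays with the typical observation, which is the usual argument for preferring the median when outliers are present.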

Oprea also critiques Banki et al.'s interpretation of the comprehension questions. Oprea rightly notes that:

...it is important to emphasize that these training questions weren’t designed to measure beliefs (e.g., payoff confusion), and because of this they are poorly suited to the task BSWW repurpose it for, ex post. Indeed, evidence from the patterns of mistakes made in these questions suggests that overall training errors largely serve as a measure of the cognitive effort (an important ingredient in Oprea (2024)’s account) subjects apply to answering these questions, and that BSWW therefore substantially overestimate the level of payoff confusion with which subjects entered the experiment.

In other words, the 'comprehension questions' are not comprehension questions at all, but they are really 'training questions' that were used to train the research participants to understand the choice tasks that they would be presented with. And so, using those training questions overall as a measure of understanding misses the point, and seriously underestimates the amount of understanding of the task that research participants had by the time they had completed the training questions.

Oprea's response is good on this point. However, if the training questions had really done a good job of training the research participants, then all participants should have had a similar level of understanding by the end of the training questions, and there should be no detectable differences in behaviour between those with more, and those with fewer, 'failed' training questions. That wasn't the case - the behaviour of the research participants who made errors in training was much more likely to be the same for lotteries and mirrors than was the behaviour of research participants who made no errors. To clear this up, it would have been interesting to have research participants also complete 'comprehension questions' at the end of the experimental session, to see if they still understood the tasks they were being asked to complete. At that point, those failing the comprehension questions could be dropped from the dataset.

One point of Banki et al.'s critique that Oprea hasn't engaged with (yet, although he promises to do so in a future, more complete response), is their finding that a larger than 'usual' proportion of the research participants fail 'first order stochastic dominance' (FOSD). A failure of FOSD in this context means that a research participant valued a lottery (or mirror) lower than a similar lottery that was strictly better. For example, valuing a 90% chance of receiving $25 less than a 10% chance of receiving $25 is a failure of FOSD. Banki et al. show that:

We begin by examining G10 and G90. Violating FOSD here involves valuing the 10% prospect strictly more than the 90% one. Across all participants (N = 583), 14.8% violated FOSD for mirrors, and 13.9% for lotteries. These rates are quite high given that the prospects differ in expected value by a factor of nine.

Those failure rates are much higher than for other similar research studies. Banki et al. note an overall rate of 20.8 percent in the Oprea results, compared with an average of 3.4 percent across eight other highly cited studies. It will be interesting to see how Oprea responds to that point in the future.
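As a sketch, with hypothetical valuations (not the study's data), checking for a FOSD violation on the G10/G90 pair is straightforward:

```python
def violates_fosd(value_g10, value_g90):
    """FOSD violation: valuing the strictly worse prospect (10% chance of $25)
    strictly more than the strictly better one (90% chance of $25)."""
    return value_g10 > value_g90

# Hypothetical participants' (G10, G90) valuations
participants = [(2.0, 22.0), (12.0, 4.0), (3.0, 20.0)]
violations = sum(violates_fosd(g10, g90) for g10, g90 in participants)
print(violations, "of", len(participants), "violate FOSD")  # 1 of 3
```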

This is an interesting debate so far. Oprea does a good job of summing up where this debate should probably go next:

Ultimately, however, these questions and ambiguities can only be fully resolved by further research. While BSWW’s critique has not convinced me that the interpretation offered in Oprea (2024) is mistaken, I am eager to see new experiments that deepen, alter, or even overturn this interpretation. First, concerns that the Oprea (2024)’s results are a consequence of the design being too confusing to yield insight can only really be resolved one way or another by followup experiments that vary his procedures, instructions and other design choices in such a way as to satisfy us that the Oprea (2024) results are (or are not) overfit to that design.

Indeed, more follow-up research is needed. Prospect Theory hasn't been overturned, yet (and as I noted in my earlier post, it is consistent with a lot of real-world behaviour). However, now we know that it may be vulnerable, and Oprea's paper provides a starting point for testing more thoroughly how much of the experimental results arise from complexity.

[HT: Riccardo Scarpa]

Read more: