## Sunday, 1 October 2017

### The magicians' dilemma and repeated games

Marginal Revolution University's latest video covers game theory, which is timely given that we covered this only a couple of weeks ago in my ECON100 class:

Unfortunately, like most treatments of game theory in principles of economics classes, it stops well short of what is possible. So, I want to take it further. The video does a good job of explaining dominant strategies, and correctly identifies the one Nash equilibrium. To confirm that this is the only Nash equilibrium, we should use the 'best response method'. To do this, we track, for each strategy that one player could choose, the best response of the other player. Where both players are selecting a best response, they are doing the best they can, given the choice of the other player (this is the definition of Nash equilibrium). Here's the game from the video (but note that I've made the payoffs easier to track by making more explicit which player gets which payoff):

And here are the best responses:
1. If Al cheats, Bob's best response is to cheat (since \$6000 is better than \$1000) [we track the best responses with ticks, and not-best-responses with crosses; Note: I'm also tracking which payoffs I am comparing with numbers corresponding to the numbers in this list];
2. If Al promises, Bob's best response is to cheat (since \$15,000 is better than \$10,000);
3. If Bob cheats, Al's best response is to cheat (since \$6000 is better than \$1000); and
4. If Bob promises, Al's best response is to cheat (since \$15,000 is better than \$10,000).
Note that Al's best response is always to cheat. This is her dominant strategy. Likewise, Bob's best response is always to cheat, which makes it his dominant strategy as well. The single Nash equilibrium occurs where both players are playing a best response (where there are two ticks), which is where both magicians choose to cheat.
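The best-response bookkeeping above can be automated. Here's a small Python sketch (the payoffs are the ones from the video; the function names and data structure are my own illustration) that finds each player's best responses and then checks every cell of the table for a Nash equilibrium:

```python
# Payoffs from the video's game, as (Al's payoff, Bob's payoff) in dollars.
payoffs = {
    ("cheat", "cheat"): (6000, 6000),
    ("cheat", "promise"): (15000, 1000),
    ("promise", "cheat"): (1000, 15000),
    ("promise", "promise"): (10000, 10000),
}
strategies = ["cheat", "promise"]

def best_responses():
    # For each strategy of one player, find the other player's best reply.
    al_best = {b: max(strategies, key=lambda a: payoffs[(a, b)][0])
               for b in strategies}   # Al's best reply to each Bob strategy
    bob_best = {a: max(strategies, key=lambda b: payoffs[(a, b)][1])
                for a in strategies}  # Bob's best reply to each Al strategy
    return al_best, bob_best

def nash_equilibria():
    # A cell is a Nash equilibrium when both players are playing a best response.
    al_best, bob_best = best_responses()
    return [(a, b) for a in strategies for b in strategies
            if al_best[b] == a and bob_best[a] == b]

print(nash_equilibria())  # [('cheat', 'cheat')]
```

Both players' best response is 'cheat' regardless of what the other does (the dominant strategies), so the only cell with 'two ticks' is (cheat, cheat).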

However, that still isn't the end of this. That solution is for a non-repeated game. A non-repeated game is played once only, after which the two players go their separate ways, never to interact again. Most games in the real world are not like that - they are repeated games. In a repeated game, the outcome may differ from the equilibrium of the non-repeated game. In this case, the two magicians probably make their choices every week, interacting with each other many times.

The 'best' choice for each magician in a repeated game may be to promise. That makes both magicians better off. However, this outcome relies on each magician being able to trust the other. How would they ensure trust? By agreeing to play the promise strategy, and then following through on the agreement. Each magician would develop a reputation for cooperation, and the other magician would then trust them. However, if either magician were to cheat, that trust would then be broken.

Robert Axelrod wrote, in The Evolution of Cooperation, about repeated prisoners' dilemmas (like the game presented in the MRU video). He found that the optimal strategy to ensure cooperation was a tit-for-tat strategy. That involves a player initially cooperating in the first play of the game, then copying the strategy of the other player from then onwards. So, if the other player cheats, you would then cheat in the next play of the game, thereby punishing them. And if they cooperate, you cooperate in the next play of the game, thereby rewarding them.
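As a rough illustration (this isn't from Axelrod's book - the payoffs are the magicians' payoffs from the video, and the function names are my own), here's how tit-for-tat plays out over ten weeks against itself and against a magician who always cheats:

```python
# (my move, their move) -> my payoff, taken from the video's game
payoffs = {
    ("cheat", "cheat"): 6000,
    ("cheat", "promise"): 15000,
    ("promise", "cheat"): 1000,
    ("promise", "promise"): 10000,
}

def tit_for_tat(their_moves):
    # Promise in the first round, then copy the other player's last move.
    return "promise" if not their_moves else their_moves[-1]

def always_cheat(their_moves):
    return "cheat"

def play(strategy_a, strategy_b, rounds=10):
    # Each strategy sees only the history of the *other* player's moves.
    hist_a, hist_b = [], []
    total_a = total_b = 0
    for _ in range(rounds):
        a, b = strategy_a(hist_a), strategy_b(hist_b)
        total_a += payoffs[(a, b)]
        total_b += payoffs[(b, a)]
        hist_a.append(b)
        hist_b.append(a)
    return total_a, total_b

print(play(tit_for_tat, tit_for_tat))   # (100000, 100000) - cooperation every week
print(play(tit_for_tat, always_cheat))  # (55000, 69000) - exploited once, then punishing
```

Two tit-for-tat players sustain the mutual promise outcome ($10,000 each per week), while against a persistent cheat, tit-for-tat only loses in the very first round before switching to punishment.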

If you don't think rewards provide enough incentive, you might try an alternative - the grim strategy. This starts off the same as tit-for-tat (with cooperation), but when the other player cheats, you start cheating and never go back to cooperating again. This maximises the punishment for cheating. Of course, it only works if the other player knows (and credibly believes) that you are playing the grim strategy.
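A quick back-of-the-envelope calculation (my own illustration, using the video's payoffs) shows why the grim threat deters cheating, provided the game lasts long enough:

```python
# Bob's total payoff from promising every week, over n weeks
def cooperate_forever(n):
    return 10000 * n

# Bob's total payoff from cheating once against a grim-playing Al:
# one week of $15,000, then mutual cheating ($6,000) every week after
def defect_once_vs_grim(n):
    return 15000 + 6000 * (n - 1)

for n in [2, 3, 10]:
    print(n, cooperate_forever(n), defect_once_vs_grim(n))
```

With only two weeks of interaction, the one-off $5,000 gain from cheating outweighs the $4,000-per-week punishment, so cheating pays; for any longer horizon, cooperation wins, and the gap grows with every additional week.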

So, while the MRU video alludes to more nuance in game theory, you can now see there is a lot more to it than a simple Nash equilibrium.