3 minute read

A friend brought up the classic Monty Hall problem. During our discussions, I realized it’s interesting to put the problem under the lens of game theory.

Refresher

You’re on a game show. The host shows you three doors, and you win a car iff you open the right door. After you name your choice (door 1), the host opens door 2, showing that it’s not the right one. Now you’re given the choice to switch to door 3. Should you do it?

Person A says it doesn’t matter, since 1 and 3 are still equally likely to be the right door. Why would eliminating door 2 make that untrue?

Person B says we should switch. Before the host opened door 2, door 1 has 1/3 chance to be the right one, i.e. there is 2/3 chance that you’re wrong. If you were wrong, switching would make you right, so door 3 now has 2/3 chance of being right!

This apparent paradox has caused heated debates throughout the years.

A two-player zero-sum game

It turns out that what you should do depends on what the host was thinking when they opened door 2.

Let us reframe the game as a two-player zero-sum game. The contestant picks a door (name it 1), then the host either opens door 1 or another door (name it 2), showing that it’s not the correct door. Now the contestant can pick one of the two unopened doors.

The contestant’s strategy can be characterized with a single probability p. If the host opens door 1, obviously the contestant can only guess among the other two doors, getting it right 1/2 of the time. If the host opens door 2, the contestant can switch with probability p.

The host’s strategy can also be characterized with a single probability q. If door 1 is correct, then the host doesn’t have any choice but to open one of the remaining two doors at random. Otherwise, the host can open door 1 with probability q, forcing the contestant to guess.

The contestant’s probability of winning would then be:

P(door 1 is right) * P(contestant doesn't switch)
+ P(door 1 is wrong) * P(host opens door 1) * P(contestant guesses right)
+ P(door 1 is wrong) * P(host opens door 2 or 3) * P(constestant switches)
= 1/3 * (1 + p + q - 2pq)

If the host wants to minimize this probability by changing q, then it depends the sign of 1-2p. If p < 1/2, the host would want q to be minimized, otherwise maximized. The contestant faces an analogous situation: if q < 1/2, p should be maximized, otherwise minimized. In other words, if p is small, q should be small, which should make p large, which should make q large… The only equilibrium is when p = q = 1/2, which makes the probability of winning 1/2. This intuitively makes sense: either player is just randomly guessing, and the result is the same as flipping a coin.

The game show, hypothetically

In the actual game show, the host never opens the contestant’s chosen door, i.e. q = 0. By the above analysis, p should be 1, resulting in a winning probability of 2/3. One way to think about this classic result is that the host is playing a bad strategy and getting exploited by the contestant.

But as a thought experiment, what if you’re the first ever contestant, and you don’t know what the host is thinking when you see door 2 being opened?

  • If q = 0 then switching is 2/3 of a prize;
  • But if q = 1 then switching is 1/3;
  • What if the host only opens the second door if you were correct on the first try? Then switching always loses.
  • What are the host’s incentives? Does the show want to save on the prizes, or is giving more prizes better for ratings?
  • What is the average contestant expected to do? Does that affect the design of the game?

How do we stay sane when playing games when the rules are unknown?