Goldsmiths Company Mathematics Course for Teachers
Game Theory
Dr. Reto Mueller, r.mueller@qmul.ac.uk
Game Theory is the study of strategic decision making

Game Theory is the study of mathematical models of conflict and cooperation between intelligent, rational decision-makers. An alternative name for this theory is interactive decision theory. Game theory is mainly used in economics, political science, and psychology, as well as in logic, computer science, and biology. A game is usually represented by a matrix or table which shows the players, strategies, and payoffs.

A \ B   left      right
up      4 \ 3     2 \ 0
down    3 \ 2     1 \ 1

Here Player A chooses a row (up or down), Player B chooses a column (left or right), and each cell shows A's payoff \ B's payoff.
What is the best strategy?

Which strategy should the players pick? Let us look at the previous game again...

A \ B   left      right
up      4 \ 3     2 \ 0
down    3 \ 2     1 \ 1

A \ B   left      right
up      0 \ 3     2 \ 2
down    1 \ 1     3 \ 0

In the first game, it is clear that the players will choose (up, left), maximising the profit for both of them. But what about the second game? Player A does better with the down strategy (no matter what B picks) and Player B does better with the left strategy (no matter what A picks). So will they end up picking (down, left), despite the fact that with (up, right) they would both profit more?
Nash equilibrium

Definition: Informally, a set of strategies is a Nash equilibrium if no player can do better by unilaterally changing his or her strategy.

To see what this means, imagine that each player is told the strategies of the others and then asks himself or herself: Knowing the strategies of the other players, can I benefit by changing my strategy? If any player would answer Yes, then that set of strategies is not a Nash equilibrium. But if every player prefers not to switch (or is indifferent between switching and not), then the set of strategies is a Nash equilibrium.

A \ B   left      right
up      4 \ 3     2 \ 0
down    3 \ 2     1 \ 1

A \ B   left      right
up      0 \ 3     2 \ 2
down    1 \ 1     3 \ 0
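This unilateral-deviation check is easy to automate. The sketch below (not from the slides; the function name and matrix encoding are our own) brute-forces all pure-strategy Nash equilibria of a two-player game and confirms the analysis of the two example games.

```python
# Brute-force search for pure-strategy Nash equilibria of a two-player game.
# payoff_a[i][j] and payoff_b[i][j] are the payoffs when Player A plays
# row i and Player B plays column j. A cell is a Nash equilibrium if
# neither player can do better by unilaterally switching.

def pure_nash_equilibria(payoff_a, payoff_b):
    rows, cols = len(payoff_a), len(payoff_a[0])
    equilibria = []
    for i in range(rows):
        for j in range(cols):
            a_happy = all(payoff_a[i][j] >= payoff_a[k][j] for k in range(rows))
            b_happy = all(payoff_b[i][j] >= payoff_b[i][l] for l in range(cols))
            if a_happy and b_happy:
                equilibria.append((i, j))
    return equilibria

# First game from the slide: row 0 = up, column 0 = left.
print(pure_nash_equilibria([[4, 2], [3, 1]], [[3, 0], [2, 1]]))  # [(0, 0)] = (up, left)

# Second game: the only equilibrium is (down, left), even though
# (up, right) would give both players more.
print(pure_nash_equilibria([[0, 2], [1, 3]], [[3, 2], [1, 0]]))  # [(1, 0)] = (down, left)
```

The nested `all(...)` conditions are a direct translation of the definition: A compares her payoff against every other row in the same column, and B against every other column in the same row.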
Nash equilibrium

If a game has one Nash equilibrium, then all players will play the Nash equilibrium strategy if
1. The players all will do their utmost to maximize their expected payoff as described by the game.
2. The players are flawless in execution.
3. The players have sufficient intelligence to deduce the solution.
4. The players know the strategy of all of the other players.
5. The players believe that a deviation in their own strategy will not cause deviations by any other players.
6. There is common knowledge that all players meet these conditions, including this one. So, not only must each player know that the other players meet the conditions, but they must also know that they all know that they meet them, and know that they know that they know that they meet them, etc.
Example 1: Prisoner's Dilemma

Imagine two prisoners held in separate cells, interrogated simultaneously, and offered deals (lighter jail sentences) for betraying their fellow criminal. [They both go to prison for one year for stealing a car, but the police think they also robbed a jewellery shop together.]

A \ B              stay quiet   betray the other
stay quiet         -1 \ -1      -3 \ 0
betray the other    0 \ -3      -2 \ -2

While both players would do better staying silent, the rational behaviour for both of them is to betray the other! A Prisoner's Dilemma has a unique Nash equilibrium (which is, however, not optimal).
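The betrayal outcome can also be seen via strict dominance: betraying is better for a player no matter what the other does. A minimal check, using our own encoding of the slide's payoffs:

```python
# Player A's payoffs in the Prisoner's Dilemma, indexed by B's choice
# (position 0 = B stays quiet, position 1 = B betrays).
a_quiet  = [-1, -3]   # A stays quiet
a_betray = [0, -2]    # A betrays

# Betraying is strictly better for A whatever B does, so "betray"
# strictly dominates "stay quiet"; by symmetry, the same holds for B.
assert all(b > q for b, q in zip(a_betray, a_quiet))
print("betray strictly dominates stay quiet")
```

Since each player has a strictly dominant strategy, (betray, betray) is the unique Nash equilibrium, even though (stay quiet, stay quiet) would be better for both.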
Example 1: Prisoner's Dilemma in the real world

1. Doping in sport. If neither of two athletes takes doping, then neither gains an advantage over the other. If only one does, then that athlete gains a significant advantage over the competitor. If both athletes take the drug, however, the benefits cancel out and only the drawbacks remain (the drug might be illegal and dangerous), putting them both in a worse position than if neither had used doping.

A \ B           don't take it   take it
don't take it    0 \ 0          -5 \ 3
take it          3 \ -5         -2 \ -2

The rational strategy for both athletes is to take the doping!
Example 1: Prisoner's Dilemma in the real world

2. Arms race. During the Cold War the opposing alliances had the choice to arm or disarm. From each side's point of view, disarming whilst their opponent continued to arm would have led to military inferiority and possible annihilation. Conversely, arming whilst their opponent disarmed would have led to superiority. If both sides chose to arm, or if both sides chose to disarm, war could be avoided. However, arming comes at the high cost of developing and maintaining a nuclear arsenal...

3. Advertising. If only one of two competing companies advertises its product [or invests more in advertising], it gains a huge advantage over the other. If they advertise at the same level (including not advertising at all), then neither of them improves compared to the other. But advertising comes at some high cost.
Example 2: Coordination game

A typical case for a coordination game is choosing the side of the road upon which to drive, a social standard which can save lives if it is widely adhered to. In a simplified example, assume that two drivers meet on a narrow dirt road. Both have to swerve in order to avoid a head-on collision. If both execute the same swerving maneuver they will manage to pass each other, but if they choose differing maneuvers they will collide.

A \ B    left      right
left     1 \ 1     -5 \ -5
right    -5 \ -5   1 \ 1

There are two Nash equilibria! Once a social standard is achieved, nobody has any interest in changing it (unless all the others change as well).
Example 2: Coordination game - Variants

1. Party or stay home?

A \ B       Party     Stay home
Party       5 \ 5     -2 \ -2
Stay home   -2 \ -2   1 \ 1

2. Change to compatible technologies?

A \ B    Change    Don't
Change   5 \ 5     -2 \ 2
Don't    2 \ -2    1 \ 1

3. Battle of the sexes (BoS, also "Bach or Stravinsky")

A \ B      Football   Opera
Football   5 \ 3      0 \ 0
Opera      -2 \ -2    3 \ 5
Example 3: Anti-Coordination Game - Chicken

The game chicken is a game in which two drivers drive towards each other on a collision course: one must swerve, or both may die in the crash; but if one driver swerves and the other does not, the one who swerved will be called a "chicken", meaning a coward.

A \ B    Swerve    Don't
Swerve   0 \ 0     -1 \ 1
Don't    1 \ -1    -10 \ -10

Like the coordination game, the anti-coordination game also has two Nash equilibria.
Example 4: Competition game

This can be illustrated by a two-player game in which both players simultaneously choose an integer from 0 to 3 and they both win the smaller of the two numbers in points. In addition, if one player chooses a larger number than the other, then he/she has to give up two points to the other.

A \ B   0        1        2        3
0       0 \ 0    2 \ -2   2 \ -2   2 \ -2
1       -2 \ 2   1 \ 1    3 \ -1   3 \ -1
2       -2 \ 2   -1 \ 3   2 \ 2    4 \ 0
3       -2 \ 2   -1 \ 3   0 \ 4    3 \ 3

This game has a unique Nash equilibrium: both players choose 0. In every other cell, at least one player can do better by unilaterally lowering his or her number.
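The payoff table can be generated directly from the verbal rule, and a brute-force deviation check then confirms the equilibrium. This is an illustrative sketch with our own function names:

```python
# Payoffs from the rule: both players win min(a, b); whoever chose the
# larger number additionally gives up 2 points to the other.
def payoff(a, b):
    pa = pb = min(a, b)
    if a > b:
        pa, pb = pa - 2, pb + 2
    elif b > a:
        pa, pb = pa + 2, pb - 2
    return pa, pb

# Rebuild the 4x4 table and check every cell for unilateral deviations.
choices = range(4)
equilibria = [
    (a, b) for a in choices for b in choices
    if all(payoff(a, b)[0] >= payoff(k, b)[0] for k in choices)   # A can't improve
    and all(payoff(a, b)[1] >= payoff(a, l)[1] for l in choices)  # B can't improve
]
print(equilibria)   # [(0, 0)] — the unique Nash equilibrium
```

Spot-checking against the table: payoff(1, 2) returns (3, -1), matching the "3 \ -1" cell where A picks 1 and B picks 2.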
Example 5: Network traffic

Every traveller has a choice of 3 strategies, where each strategy is a route from A to D (either ABD, ABCD, or ACD). The payoff of each strategy is the travel time of that route. In the graph above, a car travelling via ABD experiences a travel time of (1 + x/100) + 2, where x is the number of cars travelling on edge AB. Thus, payoffs for any given strategy depend on the choices of the other players, as is usual. However, the goal in this case is to minimize travel time, not maximize it.
Example 5: Network traffic Equilibrium will occur when the time on all paths is exactly the same. When that happens, no single driver has any incentive to switch routes, since it can only add to his/her travel time. For the graph above, if, for example, 100 cars are travelling from A to D, then equilibrium will occur when 25 drivers travel via ABD, 50 via ABCD, and 25 via ACD. Every driver now has a total travel time of 3.75 (to see this, note that a total of 75 cars take the AB edge, and likewise 75 cars take the CD edge).
Example 5: Network traffic

Nash equilibrium: 25 drivers via ABD, 50 via ABCD, and 25 via ACD. Every driver now has a total travel time of 3.75.

BUT if the 100 cars agreed that 50 travel via ABD and the other 50 via ACD, then the travel time for any single car would actually be 3.5, which is less than 3.75. This is also the Nash equilibrium if the path between B and C is removed, which means that adding an additional possible route can decrease the efficiency of the system (Braess's paradox).
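The equilibrium and the improved split can be checked numerically. The sketch below assumes edge costs consistent with the numbers in the text: AB and CD each cost 1 + x/100 (where x is the number of cars on that edge), BD and AC each cost 2, and the shortcut BC is assumed to cost a constant 0.25, the value that makes the equilibrium time come out at 3.75.

```python
# Route travel times for a given split of the 100 cars over the three routes.
def travel_times(n_abd, n_abcd, n_acd):
    t_ab = 1 + (n_abd + n_abcd) / 100   # cars on edge AB
    t_cd = 1 + (n_abcd + n_acd) / 100   # cars on edge CD
    return (t_ab + 2,                   # ABD
            t_ab + 0.25 + t_cd,         # ABCD (assumed constant BC cost: 0.25)
            2 + t_cd)                   # ACD

print(travel_times(25, 50, 25))   # (3.75, 3.75, 3.75): all routes equal,
                                  # so no driver gains by switching
print(travel_times(50, 0, 50))    # (3.5, 3.25, 3.5): faster on average, but
                                  # not stable while BC exists, since any
                                  # driver could cut to 3.25 via ABCD
```

The second line shows the paradox mechanically: the 50/50 split is only an equilibrium once the BC shortcut is removed, because with BC present a single defector to ABCD does better.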
Is there always a Nash equilibrium?

In the matching pennies game, both players reveal a penny at the same time. If they match (i.e. both show heads or both show tails), Player A gets both of them. If they don't match, Player B gets them. So the payoff matrix looks as follows:

A \ B   Heads    Tails
Heads   1 \ -1   -1 \ 1
Tails   -1 \ 1   1 \ -1

Is there a Nash equilibrium?

Another example:

A \ B        Strategy 1   Strategy 2
Strategy 1   0 \ 2        1 \ 1
Strategy 2   2 \ 0        0 \ 2
Mixed strategies

The answer to the question is: There is always a Nash equilibrium if one allows mixed strategies, i.e. an assignment of a probability to each pure strategy. [Note: Since probabilities are continuous, there are infinitely many mixed strategies available to a player, even if there are only finitely many pure strategies.]

In a game with two players having two strategies each, this is easy: Player A chooses the first strategy with probability p (where 0 ≤ p ≤ 1) and the other strategy with probability (1 - p). Player B picks the first strategy with probability q (where 0 ≤ q ≤ 1) and the second strategy with probability (1 - q).
Matching pennies

A \ B         H             T             qH + (1-q)T
H             1 \ -1        -1 \ 1        2q-1 \ 1-2q
T             -1 \ 1        1 \ -1        1-2q \ 2q-1
pH + (1-p)T   2p-1 \ 1-2p   1-2p \ 2p-1   ... \ ...

In order for Player B to be willing to randomize, Player A should play in such a way that the expected payoff for each pure strategy of B (and hence also for every mixed strategy) is the same. So Player A picks p in such a way that 1 - 2p = 2p - 1, i.e. p = 1/2. Similarly, Player B picks q = 1/2. We get

A \ B             H        T        (1/2)H + (1/2)T
H                 1 \ -1   -1 \ 1   0 \ 0
T                 -1 \ 1   1 \ -1   0 \ 0
(1/2)H + (1/2)T   0 \ 0    0 \ 0    0 \ 0
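The indifference argument is easy to verify numerically. A small sketch (the helper function is our own, with payoffs taken from the matching pennies table):

```python
def b_expected(p):
    """B's expected payoffs (vs Heads, vs Tails) when A plays Heads with probability p."""
    vs_heads = p * (-1) + (1 - p) * 1   # = 1 - 2p
    vs_tails = p * 1 + (1 - p) * (-1)   # = 2p - 1
    return vs_heads, vs_tails

print(b_expected(0.5))    # (0.0, 0.0) — B is indifferent, so p = 1/2 works
print(b_expected(0.75))   # (-0.5, 0.5) — B would exploit A by playing Tails
```

The second call shows why no other p can be part of an equilibrium: as soon as A leans towards Heads, B's best response is the pure strategy Tails, which in turn gives A an incentive to deviate.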
Mixed strategies in previous games

Coordination game:

A \ B         left               right
left          1 \ 1              -5 \ -5
right         -5 \ -5            1 \ 1
pL + (1-p)R   ... \ p - 5(1-p)   ... \ -5p + (1-p)

We solve -5 + 6p = 1 - 6p, which gives p = 1/2. For symmetry reasons, Player B then picks q = 1/2. This means that a mixed strategy is randomly going either left or right to avoid the crash. Surprisingly, this is a Nash equilibrium!

A \ B             L         R         (1/2)L + (1/2)R
L                 1 \ 1     -5 \ -5   -2 \ -2
R                 -5 \ -5   1 \ 1     -2 \ -2
(1/2)L + (1/2)R   -2 \ -2   -2 \ -2   -2 \ -2
Mixed strategies in previous games

Battle of the sexes:

A \ B             Football            Opera
Football          5 \ 3               0 \ 0
Opera             -2 \ -2             3 \ 5
pF + (1-p)Opera   ... \ 3p - 2(1-p)   ... \ 0 + 5(1-p)

We solve -2 + 5p = 5 - 5p, which gives p = 0.7. For symmetry reasons, Player B then picks his favourite option (namely Opera) with probability 0.7. This means that a mixed strategy is going with probability 0.7 to your favourite event!

A \ B         F           O           0.3F + 0.7O
F             5 \ 3       0 \ 0       1.5 \ 0.9
O             -2 \ -2     3 \ 5       1.5 \ 2.9
0.7F + 0.3O   2.9 \ 1.5   0.9 \ 1.5   1.5 \ 1.5

(Note that B's mix makes A indifferent — A gets 1.5 from either pure strategy — and A's mix makes B indifferent, so in the mixed equilibrium both players get 1.5.)
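The same indifference computation for the Battle of the Sexes, sketched with our own encoding and exact fractions to avoid rounding noise:

```python
from fractions import Fraction

def b_expected(p):
    """B's expected payoffs (vs Football, vs Opera) when A plays Football with probability p."""
    vs_football = p * 3 + (1 - p) * (-2)   # = 5p - 2
    vs_opera    = p * 0 + (1 - p) * 5      # = 5 - 5p
    return vs_football, vs_opera

p = Fraction(7, 10)                 # A goes to the football with probability 0.7
f, o = b_expected(p)
assert f == o == Fraction(3, 2)     # both equal 1.5: B is indifferent, so p = 0.7
print(f, o)
```

Solving 5p - 2 = 5 - 5p by hand gives 10p = 7, i.e. p = 7/10, which is exactly what the assertion confirms.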
The Final Problem

From the book "The Final Problem"... where Holmes and Watson attempted to flee England to get away from Professor Moriarty, who was pursuing Holmes for the purpose of killing him before Holmes' investigation could bring down the Professor and his criminal empire.

Holmes elected to flee to the European continent while the London police did their work to wrap up the case he had provided for them. Holmes and Watson took a train and headed to Dover, where they could board a steamer to Calais and leave Moriarty behind and vulnerable. Holmes knew the train schedule, as did Moriarty. There was an intermediate stop in Canterbury. By this time, Moriarty and his confederates were close upon the trail chasing Holmes.

Holmes \ Moriarty   Out at C   Go to D
Out at C            0 \ 2      1 \ 1
Go to D             2 \ 0      0 \ 2
The Final Problem

Holmes \ Moriarty   Out at C   Go to D
Out at C            0 \ 2      1 \ 1
Go to D             2 \ 0      0 \ 2

What is the strategy that they choose?

SOLUTION: Holmes gets out at Canterbury with probability 2/3. Moriarty, on the other hand, stays on the train with probability 2/3.

SPOILER: In the book, Holmes does indeed get out at Canterbury, while Moriarty's special pursuit train speeds through, hoping to catch Holmes in Dover and kill him there.
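The 2/3 and 1/3 probabilities drop out of the same indifference equations as before; a quick check with exact fractions (our own sketch, payoffs from the table):

```python
from fractions import Fraction

# Holmes gets out at Canterbury with probability p. Moriarty's expected
# payoffs against this mix: "out at C" gives 2p, "go to D" gives p + 2(1 - p).
p = Fraction(2, 3)
assert 2 * p == p + 2 * (1 - p)   # Moriarty indifferent  =>  p = 2/3

# Moriarty gets out at Canterbury with probability q. Holmes's expected
# payoffs: "out at C" gives 0*q + 1*(1 - q), "go to D" gives 2q + 0*(1 - q).
q = Fraction(1, 3)
assert 1 - q == 2 * q             # Holmes indifferent  =>  q = 1/3, i.e.
                                  # Moriarty rides on to Dover with prob 2/3
print(p, q)
```

Solving 2p = p + 2(1 - p) gives 3p = 2, and 1 - q = 2q gives 3q = 1, exactly the probabilities stated in the solution.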