Starting in 1980, American political scientist Robert Axelrod ran a series of computer tournaments. He invited fellow researchers to submit computer programs (written in Fortran or Basic) that would play a very special game against each other: the iterated prisoner’s dilemma. The results turned out to be quite remarkable. Simple strategies scored far better than elaborate ones. Strategies that sought to cooperate with other strategies did better than strategies designed to trick the other player. Even more interesting, when Axelrod launched a second tournament, the winning strategy of the first tournament was victorious once more. In this post, you will learn how to run your own Axelrod tournaments in R.

The prisoner’s dilemma

Imagine two suspects being interrogated by the police. They sit in separate rooms and cannot communicate with one another. The prosecutors don’t have enough evidence to convict them on the principal charge. So, if both suspects refuse to cooperate with the police, they can only be convicted on a lesser charge. The police, however, offer each of them a deal: testify against the other. If only one of the suspects accepts this deal, he will be set free and the other will face serious prison time. If they both betray each other, they both go to prison, though not for as long. What should the suspects do?

Imagine that one of the suspects knew for sure that the other would never betray him. In that case, he has to decide between being convicted on the lesser charge and facing no sentence at all. Obviously, the better choice is to cooperate with the police. Now imagine that the same suspect knew that the other would definitely betray him. In that case, he has to decide between going to prison for a very long time and going for a medium amount of time. Again, the better option is to cooperate with the police. So regardless of what the other suspect does, betrayal is the best option. Thus, if both suspects behave egoistically, they will both end up in prison on the principal charge. Of course, had they refused to turn on each other, neither would have been convicted on the main charge. The two suspects’ situation is therefore known as the prisoner’s dilemma.

The iterated prisoner’s dilemma

Despite its name, the prisoner’s dilemma arises in many different situations. A prime example is the case of two companies that together dominate a particular industry (e.g., Airbus and Boeing, Coca-Cola and Pepsi, or McDonald’s and Burger King). The two competitors could cooperate with one another (not with the police this time) and collude like a cartel. They could then command significant price markups and boost their profits at the expense of consumers. Of course, on any given day one of the two rivals might decide to turn on the other and lower its prices to increase its market share. Obviously, the other firm would retaliate and lower its prices too. The two firms are thus in the same dilemma as the two prisoners, except that they play this game over and over again. What should they do?

Assume that the succession of games goes on forever. Unless the two firms have a very strong preference for high profits in the current period and care little about their long-term profits, they will probably choose to cooperate with one another. But what if the game only went on for a finite number of rounds? In that case, both players will choose to defect in the last round. This is because in the last round they face the traditional one-period prisoner’s dilemma from above, and the other firm has no chance to retaliate in the future. This line of reasoning, however, implies that they will also choose to defect in the second-to-last round. Why? Both know that they will defect in the last round anyway, so neither player can retaliate against a defection in the second-to-last round. Thus, the two firms will not cooperate in the penultimate round either. Two perfectly rational players would then apply the same kind of reasoning to all rounds of the game and find it optimal never to cooperate with the other player and instead defect all the time.

The Axelrod tournaments

Robert Axelrod’s tournaments worked as follows. The contestants would send in computer code for their strategy. Each strategy played 200 rounds of the iterated prisoner’s dilemma against every other strategy, against itself, and against a strategy that made purely random decisions. Strategies could involve random elements or base their decisions entirely on the previous rounds of the game. They could, however, neither communicate with the other player nor make any kind of binding agreements. In a round where both players cooperated with one another, each received the reward for mutual cooperation of \(r=3\) points. If one of the players betrayed the other, he obtained the temptation payoff of \(t=5\) points, while the other received the sucker’s payoff of \(s=0\). Lastly, if both players defected, they received the punishment for mutual defection of \(p=1\) point.

Fourteen researchers submitted strategies. The winning strategy, submitted by Anatol Rapoport, followed a very simple rule called TIT FOR TAT: in the first round of each game, it would cooperate with the other player; thereafter, it would simply play whatever the other player had played in the previous round. Axelrod published the results of the tournament in his article “Effective Choice in the Prisoner’s Dilemma” (see the references below). Researchers were then invited to participate in a second tournament. To avoid certain end-game effects, Axelrod slightly altered the way the games would end, but otherwise the tournament was identical. Sixty-two researchers participated. Many new strategies, some of them very sophisticated, had been developed, but again none could beat TIT FOR TAT, which Anatol Rapoport had simply re-submitted.
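
The payoffs \(r\), \(t\), \(s\), and \(p\) described above define a standard one-shot prisoner’s dilemma. As a minimal illustration (this matrix is not part of the tournament code), here are the row player’s payoffs in R; note the ranking \(t > r > p > s\) that creates the dilemma:

# the row player's payoffs in the one-shot prisoner's dilemma
# rows: own move, columns: opponent's move
M <- matrix(c(3,0,5,1), 2, 2, byrow=TRUE, dimnames=list(c("C","D"),c("C","D")))
M
##   C D
## C 3 0
## D 5 1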

Some simple strategies in R

As said before, strategies can be history-dependent. Thus, they will be a function of the players’ previous decisions. In our case, the history will be a simple two-column matrix called \(h\), with one row per round and one column per player.

# example history of previous rounds
# "C": player cooperated
# "D": player defected
h <- matrix(sample(c("C","D"), 10, replace=TRUE), ncol=2)
colnames(h) <- c("Player 1","Player 2")
h
##      Player 1 Player 2
## [1,] "C"      "C"     
## [2,] "D"      "C"     
## [3,] "C"      "C"     
## [4,] "C"      "D"     
## [5,] "D"      "D"

Moreover, with history-dependent strategies, we need a second function argument \(p\) that denotes which of the two players is using the strategy. We will use this argument to distinguish between the two players’ decisions in the history matrix. If a strategy is not history-dependent, this argument is technically not necessary. To keep our code consistent, however, we will nonetheless include it in all of our functions.

# a strategy that always cooperates
ALLC <- function(p,h) return("C")

# a strategy that always defects
ALLD <- function(p,h) return("D")

# a strategy that plays entirely at random
RANDOM <- function(p,h) return(sample(c("C","D"),1))

# a strategy that switches back and forth between "C" and "D" (starting with "D")
ALTERNATE <- function(p,h) {
  if(nrow(h) %% 2 == 0) return("D")
  return("C")
}

# a strategy that cooperates until the other player defects
# thereafter the strategy defects all the time
FRIEDMAN <- function(p,h) {
  if(nrow(h) == 0) return("C")
  if(any(h[,-p] == "D")) return("D")
  return("C")
}

As you can see, FRIEDMAN is the only one of these strategies that depends on the game’s history. It was submitted to Axelrod’s first tournament by James W. Friedman. In the first round of the game, i.e., when there are no rows in the history matrix, it returns “C” for cooperation. Thereafter, it defects (“D”) as soon as the history matrix contains at least one defection in the opposing player’s column.
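
For example, we can apply FRIEDMAN to the example history \(h\) from above. Since Player 2 has already defected there (in rows 4 and 5), the strategy defects when it plays as Player 1:

# FRIEDMAN's decision as Player 1, given the example history from above
FRIEDMAN(p=1, h=h)
## [1] "D"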

Scoring and playing our first games

With these strategies in hand, we are ready to play our first couple of games. This requires us to write a scoring function that turns a game’s history into a score for each player.

# payoff function
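# r: reward, t: temptation, s: sucker's payoff, p: punishment (not the player index here)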
payoff <- function(h, r=3, t=5, s=0, p=1) {
  if(h[1] == "C" & h[2] == "C") result = c(r,r)
  if(h[1] == "C" & h[2] == "D") result = c(s,t)
  if(h[1] == "D" & h[2] == "C") result = c(t,s)
  if(h[1] == "D" & h[2] == "D") result = c(p,p)
  names(result) <- colnames(h)
  return(result)
}
# example history
h
##      Player 1 Player 2
## [1,] "C"      "C"     
## [2,] "D"      "C"     
## [3,] "C"      "C"     
## [4,] "C"      "D"     
## [5,] "D"      "D"
# payoffs of this history
t(apply(h, MAR=1, FUN=payoff))
##      [,1] [,2]
## [1,]    3    3
## [2,]    5    0
## [3,]    3    3
## [4,]    0    5
## [5,]    1    1

Now we can write a function that plays a game of 200 turns between two strategies. This function takes four arguments: the two strategies, a vector of names for these strategies, and the number of turns to be played, which defaults to 200. The game function first creates an empty history matrix. It then loops over the turns and calls the two strategy functions, passing them the player numbers and the game’s history so far. Finally, the complete history is scored and the scores are added up to determine the winner.

# game function
game <- function(STRATEGY1, STRATEGY2, snames=c("P1","P2"), nturns=200) {
  
  # empty history matrix
  history <- matrix(NA,nturns,2)
  colnames(history) <- snames

  # run game and fill in history matrix
  for(turn in 1:nturns) {
    decision1 <- STRATEGY1(p=1, h=history[0:(turn-1),,drop=FALSE])
    decision2 <- STRATEGY2(p=2, h=history[0:(turn-1),,drop=FALSE])
    history[turn,] <- c(decision1,decision2)
  }
  
  # compute payoffs and total scores
  payoffs <- t(apply(history, MAR=1, FUN=payoff))
  colnames(payoffs) <- snames
  totals <- colSums(payoffs)
  
  # return
  return(list("History"=history, "Totals"=totals))
  
}
# two very nice players
game(ALLC, ALLC, nturns=10)
## $History
##       P1  P2 
##  [1,] "C" "C"
##  [2,] "C" "C"
##  [3,] "C" "C"
##  [4,] "C" "C"
##  [5,] "C" "C"
##  [6,] "C" "C"
##  [7,] "C" "C"
##  [8,] "C" "C"
##  [9,] "C" "C"
## [10,] "C" "C"
## 
## $Totals
## P1 P2 
## 30 30
# an unfair game
game(ALLC, ALLD, nturns=10)
## $History
##       P1  P2 
##  [1,] "C" "D"
##  [2,] "C" "D"
##  [3,] "C" "D"
##  [4,] "C" "D"
##  [5,] "C" "D"
##  [6,] "C" "D"
##  [7,] "C" "D"
##  [8,] "C" "D"
##  [9,] "C" "D"
## [10,] "C" "D"
## 
## $Totals
## P1 P2 
##  0 50
# random moves vs. alternating
game(RANDOM, ALTERNATE, nturns=10)
## $History
##       P1  P2 
##  [1,] "C" "D"
##  [2,] "C" "C"
##  [3,] "C" "D"
##  [4,] "C" "C"
##  [5,] "C" "D"
##  [6,] "D" "C"
##  [7,] "D" "D"
##  [8,] "D" "C"
##  [9,] "D" "D"
## [10,] "C" "C"
## 
## $Totals
## P1 P2 
## 21 26

More strategies

# tit for tat
TFT <- function(p,h) {
  if(nrow(h) == 0) return("C")
  if(h[nrow(h),-p] == "C") return("C")
  return("D")
}

# tit for two tats
TF2T <- function(p,h) {
  if(nrow(h) < 2) return("C")
  if(all(h[(nrow(h)-1):nrow(h),-p] == "D")) return("D")
  return("C")
}

# tit for tat's sneaky brother
JOSS <- function(p,h) {
  if(nrow(h) == 0) return("C")
  if(h[nrow(h),-p] == "C") {
    return(sample(c("C","D"), 1, prob=c(.9,.1)))
  }
  return("D")
}

The first of these strategies is TIT FOR TAT. It cooperates in the first round and thereafter cooperates as long as its opponent did not defect in the previous round. A more generous version of this strategy is TIT FOR TWO TATS. It starts out with two rounds of cooperation and then only defects if its opponent defected in both of the previous two rounds. Lastly, JOSS, a strategy submitted by Johann Joss, plays TIT FOR TAT most of the time. However, when its opponent cooperated in the previous round, it reciprocates this cooperation only with a 90% chance. Thus, it works like regular TIT FOR TAT but occasionally sneaks in a few additional defections. As it turns out, these still comparatively simple strategies can already produce some remarkable patterns.
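
Before letting them loose on each other, we can sanity-check the new strategies against the example history \(h\) from above. Its last two rows end with defections by Player 2, so both TIT FOR TAT and TIT FOR TWO TATS defect when playing as Player 1:

# responses to the example history (Player 2 defected in the last two rounds)
TFT(p=1, h=h)
## [1] "D"
TF2T(p=1, h=h)
## [1] "D"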

# JOSS vs. TIT FOR TAT
set.seed(123) # set seed
game(JOSS, TFT, nturns=20)
## $History
##       P1  P2 
##  [1,] "C" "C"
##  [2,] "C" "C"
##  [3,] "C" "C"
##  [4,] "C" "C"
##  [5,] "C" "C"
##  [6,] "D" "C"
##  [7,] "C" "D"
##  [8,] "D" "C"
##  [9,] "C" "D"
## [10,] "D" "C"
## [11,] "C" "D"
## [12,] "D" "C"
## [13,] "C" "D"
## [14,] "D" "C"
## [15,] "C" "D"
## [16,] "D" "C"
## [17,] "D" "D"
## [18,] "D" "D"
## [19,] "D" "D"
## [20,] "D" "D"
## 
## $Totals
## P1 P2 
## 49 44

Both strategies start out cooperating. In turn 6, however, JOSS throws in one of its deviations. TIT FOR TAT immediately retaliates in turn 7. In turn 8, JOSS, which has now switched back to playing just like TIT FOR TAT, retaliates too. The actual TIT FOR TAT strategy interprets this as a new provocation and retaliates again in turn 9. This echo pattern repeats itself until turn 17, when JOSS throws in its second deviation. Now both strategies have defected in the same turn, and there is no more cooperation in any of the remaining turns. Of course, JOSS is not a particularly nice strategy, and you might consider it only fair when its opponent retaliates against JOSS’s sneaky defections. Nonetheless, it can pay off to be forgiving, which is what TIT FOR TWO TATS does.

# JOSS vs. TIT FOR TWO TATS
set.seed(123) # set seed
game(JOSS, TF2T, nturns=20)
## $History
##       P1  P2 
##  [1,] "C" "C"
##  [2,] "C" "C"
##  [3,] "C" "C"
##  [4,] "C" "C"
##  [5,] "C" "C"
##  [6,] "D" "C"
##  [7,] "C" "C"
##  [8,] "C" "C"
##  [9,] "C" "C"
## [10,] "C" "C"
## [11,] "C" "C"
## [12,] "D" "C"
## [13,] "C" "C"
## [14,] "C" "C"
## [15,] "C" "C"
## [16,] "C" "C"
## [17,] "C" "C"
## [18,] "C" "C"
## [19,] "C" "C"
## [20,] "C" "C"
## 
## $Totals
## P1 P2 
## 64 54

As we can see, JOSS again deviates in turn 6 and a second time in turn 12. This time, however, these defections do not result in an escalation of mutual defections. While TIT FOR TAT only scored 44 points against JOSS, TIT FOR TWO TATS scored 54. Of course, JOSS now also did better. In fact, the margin between JOSS and its opponent has grown, and most people would regard this as unfair. In absolute terms, however, TIT FOR TWO TATS’s forgiveness has paid off.

Tournaments

Of course, pairing each strategy against every other strategy by hand is quite tedious. Therefore, we need a function that can run an entire tournament for us. The function takes three arguments: a list of the strategies participating in the tournament, a vector with their names, and the number of turns to be played in each game. It then pairs each strategy against every other strategy and also lets each strategy play against itself. Its output is a leaderboard that details how well each strategy did against each of its opponents and how many points it scored on average across all of its games.

# tournament function
tournament <- function(strategies, names=NULL, nturns=200) {
  
  # empty results matrix
  nstrat <- length(strategies)
  results <- matrix(NA,nstrat,nstrat)
  if(is.null(names)) names <- paste("Strategy", 1:nstrat)
  colnames(results) <- names
  rownames(results) <- names
  
  # fill matrix
  for(i in 1:nstrat){
    for(j in 1:nstrat){
      if(i > j) next # skip if already filled in via [j,i]
      result <- game(strategies[[i]],strategies[[j]],nturns=nturns)
      results[i,j] <- result$Totals[1]
      results[j,i] <- result$Totals[2]
    }
  }
  
  # average across all strategies
  avg <- apply(results,MAR=1,FUN=mean)
  avgpturn <- avg/nturns

  
  # order by average score
  order <- order(avg, decreasing=TRUE)
  results <- results[order,] # order rows
  results <- results[,order] # order columns
  
  # add averages to results matrix
  results <- cbind(round(avg[order],1),round(avgpturn[order],2),results)
  colnames(results)[1:2] <- c("Average","Avg./Turn")
  
  # return
  return(results)
  
}
# run tournament
set.seed(123)
strategies <- c(ALLC,ALLD,RANDOM,FRIEDMAN,TFT,JOSS,TF2T)
names <- c("ALLC","ALLD","RANDOM","FRIEDMAN","TFT","JOSS","TF2T")
tournament(strategies, names)
##          Average Avg./Turn TF2T FRIEDMAN TFT ALLC RANDOM JOSS ALLD
## TF2T       500.9      2.50  600      600 600  600    376  532  198
## FRIEDMAN   491.9      2.46  600      600 600  600    599  245  199
## TFT        479.7      2.40  600      600 600  600    458  301  199
## ALLC       464.1      2.32  600      600 600  600    309  540    0
## RANDOM     423.0      2.12  631      104 458  794    443  424  107
## JOSS       389.1      1.95  642      245 306  640    469  223  199
## ALLD       370.3      1.85  208      204 204 1000    572  204  200

As we can see, it is the “friendly” strategies that score the most points. This is in fact one of the key results of Axelrod’s research. Strategies like TIT FOR TWO TATS, FRIEDMAN, or ALLC never defect first. This enables them to rack up many points when playing against themselves or against other friendly strategies. Of course, the highest score in an individual match (1000 points) is obtained when ALLD exploits ALLC. However, when playing against itself or any of the other strategies, ALLD typically gets trapped in a bad equilibrium in which both players constantly defect.
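
You can check the 1000-point figure directly: in a 200-turn game, ALLD collects the temptation payoff of \(t=5\) in every single round, while ALLC is left with the sucker’s payoff throughout.

# ALLD exploiting ALLC over the default 200 turns
game(ALLD, ALLC)$Totals
##   P1   P2 
## 1000    0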

Even more strategies

Now it’s up to you to program your own strategies and have them play against each other in different tournaments. Below is some code for a few of the strategies that participated in the original Axelrod tournaments; at the end of this section, you will find an example of how to enter all of them into one big tournament. You can read up on these strategies in Axelrod’s papers (see below) or in his book “The Evolution of Cooperation”. Unfortunately, Axelrod doesn’t always describe the strategies in full detail, so there is some room for interpretation when coding them.

# always play "D" once the other player has defected once
# but play "C" for the first ten rounds
DAVIS <- function(p,h) {
  if(nrow(h) < 10) return("C")
  if(any(h[,-p] == "D")) return("D")
  return("C")
}

# start out like tit for tat, but reciprocate cooperation only stochastically
# probability of cooperation decreases over time
FELD <- function(p,h,nturns=200) {
  if(nrow(h) == 0) return("C")
  if(h[nrow(h),-p] == "C") {
    prob <- seq(from=1,to=.5,length.out=nturns)[nrow(h)+1]
    return(sample(c("C","D"), 1, prob=c(prob,1-prob)))
  }
  return("D")
}

# play "C" in round 0
# if players played differently in the previous round, defect with chance 2/7
GROFMAN <- function(p,h) {
  if(nrow(h) == 0) return("C")
  if(h[nrow(h),p] != h[nrow(h),-p]) {
    return(sample(c("C","D"), 1, prob=c(2/7,5/7)))
  }
  return("C")
}

# decision is based on the previous three moves
# different weights and values are assigned to different moves and players
NYDEGGER <- function(p,h) {
  if(nrow(h) == 0) return("C") # 1st move
  if(nrow(h) == 1) { # 2nd move: tit for tat
    if(h[1,-p] == "C") return("C") else return("D")
  }
  if(nrow(h) == 2) { # 3rd move: tit for tat, unless this player was the lone
    # cooperator in round 1 and the lone defector in round 2
    if(h[1,p] == "C" & h[1,-p] == "D" & h[2,p] == "D" & h[2,-p] == "C") return("D")
    if(h[2,-p] == "C") return("C") else return("D")
  }
  mat <- h[(nrow(h)-2):nrow(h),,drop=FALSE] # any other move: last three rounds
  points <- rep(0,3) # 0: both cooperated
  for(i in 1:3) {
    if(mat[i,p] == "C" & mat[i,-p] == "D") points[i] <- 1 # only the other defected
    if(mat[i,p] == "D" & mat[i,-p] == "C") points[i] <- 2 # only this player defected
    if(mat[i,p] == "D" & mat[i,-p] == "D") points[i] <- 3 # both defected
  }
  points <- sum(points*c(16,4,1))
  if(points %in% c(1,6,7,17,22,23,26,29,30,31,33,38,39,45,49,54,55,58,61)) {
    return("D")
  }
  return("C")
}

# cooperate until the other defects, then retaliate once
# if the opponent defects again, retaliate twice, and so on
SHUBIK <- function(p, h) {
  
  # first round
  if(nrow(h) == 0) return("C")
  
  # start of retaliation
  if(h[nrow(h),p] == "C" &  h[nrow(h),-p] == "D") return("D")
  
  # ongoing retaliation
  if(nrow(h) >= 2) {
    k = 0
    for(i in 2:nrow(h)) if(h[i-1,p] == "D" & h[i,p] == "C") k <- k+1
    if(h[nrow(h),p] == "D" & any(h[(nrow(h)-k):nrow(h),p] == "C")) return("D")
  }
  
  # else
  return("C")
}

# cooperate at first, then defect stochastically
# probability depends on the opponent's previous ten moves
TULLOCK <- function(p,h){
  
  # first eleven rounds
  if(nrow(h) < 11) return("C")
  
  # subsequent rounds
  shareD <- sum(h[((nrow(h)-9):nrow(h)),-p] == "D")/10
  return(sample(c("C","D"),T,prob=c(1-shareD,shareD)))
  
}
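
As promised above, here is one way to enter all of the strategies defined in this post into a single tournament. Several of them are stochastic, so the exact leaderboard will depend on your random seed, and no output is shown here.

# a bigger tournament with every strategy from this post
strategies <- c(ALLC, ALLD, RANDOM, ALTERNATE, FRIEDMAN, TFT, TF2T, JOSS,
                DAVIS, FELD, GROFMAN, NYDEGGER, SHUBIK, TULLOCK)
names <- c("ALLC","ALLD","RANDOM","ALTERNATE","FRIEDMAN","TFT","TF2T","JOSS",
           "DAVIS","FELD","GROFMAN","NYDEGGER","SHUBIK","TULLOCK")
set.seed(123)
tournament(strategies, names)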

References and further reading

Axelrod, Robert (1980): “Effective Choice in the Prisoner’s Dilemma”, in The Journal of Conflict Resolution, Vol. 24, No. 1, pp. 3-25, https://www.jstor.org/stable/173932?seq=1.

Axelrod, Robert (1980): “More Effective Choice in the Prisoner’s Dilemma”, in The Journal of Conflict Resolution, Vol. 24, No. 3, pp. 379-403, https://www.jstor.org/stable/173638?seq=1.

Axelrod, Robert (1981): “The Emergence of Cooperation among Egoists”, in The American Political Science Review, Vol. 75, No. 2, pp. 306-318, https://www.jstor.org/stable/1961366?seq=1.

Axelrod, Robert (1984): The Evolution of Cooperation, Basic Books, New York.