Visualizing the Prisoner's Dilemma
Jul 9, 2017 · 8 minute read clojureinteractiveThe grid you see below is a simulation running in your browser of a game called The Prisoner’s Dilemma
. Each cell represents a player, and the colour of the cell indicates the strategy the player is using. Notice how the strategies that perform best gain more territory.
In this article we will walk through the code that runs the simulation, but first let’s explain the rules of the game:

each player makes a decision to either
cooperate
ordefect

the players cannot communicate with each other while making their decision

the points the players get after deciding are determined by the following table:
player 2 

cooperate 
defect 

player 1 
cooperate 
(3,3) 
(0,5) 
defect 
(5,0) 
(1,1) 
For example, if player 1 defects and player 2 cooperates, player 1 would get 5 points and player 2 would get 0 points. When the game is played more than once it is known as Iterated Prisoner’s dilemma and it’s what we are simulating in this article.
Now that the game is clear let’s take a look at the implementation. We will have 3 simple strategies:

alwayscooperate
 keep cooperating no matter what the opponent does 
alwaysdefect
 keep defecting no matter what the opponent does 
titfortat
 start by cooperating and subsequently do whatever the opponent did in the previous round
We can capture these strategies using a multimethod:
(defmulti play :strategy)
(defmethod play :alwayscooperate [_]
:cooperate)
(defmethod play :alwaysdefect [_]
:defect)
(defmethod play :titfortat [{:keys [opponentlastdecision]}]
(if (nil? opponentlastdecision)
:cooperate
opponentlastdecision))
(play {:strategy :titfortat
:opponentlastdecision :defect})
(Note: You can edit the code blocks directly in the browser. Just press Ctrl+Enter
to trigger reevaluation.)
Given any 2 strategies we can generates a sequence of results containing a score
and the two decisions
made:
(defn matchresults
([s1 s2] (matchresults s1 s2 nil nil))
([s1 s2 lm1 lm2]
(let [m1 (play {:strategy s1 :opponentlastdecision lm2})
m2 (play {:strategy s2 :opponentlastdecision lm1})
score (case [m1 m2]
[:cooperate :cooperate] [3 3]
[:defect :cooperate] [5 0]
[:cooperate :defect ] [0 5]
[:defect :defect ] [1 1])]
(lazyseq
(cons [score m1 m2] (matchresults s1 s2 m1 m2))))))
So if a player using titfortat
and another using alwaysdefect
play 4 rounds:
(take 4 (matchresults :titfortat :alwaysdefect))
For the first frame we generate random strategies:
(take 9
(repeatedly
#(randnth [:alwayscooperate :alwaysdefect :titfortat])))
This works, but let’s write a version that takes a strategydistribution
so that we can try out different densities of a particular strategy:
(def strategydistribution {:alwayscooperate 2
:alwaysdefect 1
:titfortat 5})
(defn randomplayerstrategies [strategydistribution]
(let [strategies (reducekv #(into %1 (repeat %3 %2))
[]
strategydistribution)]
(repeatedly #(randnth strategies))))
(take 9 (randomplayerstrategies strategydistribution))
Now that we can generate strategies we can render them to a canvas:
(defn renderplayerstrategies! [canvas strategies]
(let [ctx (.getContext canvas "2d")
width (.width canvas)
height (.height canvas)
gridwidth (js/Math.sqrt (count strategies))
cellwidth (/ width gridwidth)
cellheight (/ height gridwidth)]
(.clearRect ctx 0 0 width height)
(doseq [[i strategy] (mapindexed vector strategies)]
(set! (.fillStyle ctx) (case strategy
:alwayscooperate "green"
:alwaysdefect "red"
:titfortat "orange"))
(.fillRect ctx
(* (mod i gridwidth) cellwidth)
(* (quot i gridwidth) cellheight)
(dec cellwidth)
(dec cellheight)))))
Let’s try it out:
(def gridwidth 15)
(renderplayerstrategies!
(js/document.getElementById "canvasrender")
(take (* gridwidth gridwidth)
(randomplayerstrategies {:alwayscooperate 4
:alwaysdefect 4
:titfortat 2})))
As you can see in the diagram below, each player will play against his neighbours. In other words, there is a match
for each one of the red lines between the cells:
Given a certain gridwidth
we can generate a list of these matches
:
(defn comparegrid [g l r]
(let [c (compare (mod l g) (mod r g))]
(if (zero? c)
(compare l r)
c)))
(defn generatematches [gridwidth]
(let [noofplayers (* gridwidth gridwidth)
playershorizontal (range noofplayers)
playersvertical (sort (partial comparegrid gridwidth)
(range noofplayers))
players (concat playershorizontal playersvertical)]
(mapcat #(partition 2 1 %)
(partition gridwidth players))))
(def m (generatematches 3))
m
Given the matches
we can build a vector containing the neighbours
of a particular player:
(defn generateneighbours [matches]
(reduce (fn [m d]
(let [k (first d)
v (get m k [(first d)])]
(assoc m k (conj v (second d)))))
[]
(mapcat #(vector % (reverse %)) matches)))
(def n (generateneighbours m))
n
So for example the neighbours for player 4 would be:
(n 4)
Ok now given the strategies
and lastdecisions
from the previous matches, we can calculate the next set of results:
(defn matchesresults [playerstrategies matchlastdecisions noofmatchesperframe]
(reduce
(fn [results match]
(let [[strategyp1 strategyp2] (vals (selectkeys playerstrategies match))
[lastdecisionp1 lastdecisionp2] (matchlastdecisions match)
result (reduce (fn [[s1 _ _][s2 m1 m2]]
(vector (map + s1 s2) m1 m2))
(take noofmatchesperframe
(matchresults strategyp1 strategyp2 lastdecisionp1 lastdecisionp2)))]
(assoc results match result)))
{}
(keys matchlastdecisions)))
(def frame1
(into []
(take 9 (randomplayerstrategies {:alwayscooperate 1
:alwaysdefect 1
:titfortat 2}))))
(def mr
(matchesresults frame1
(zipmap m (repeat [nil nil]))
5))
mr
To get the final scores we will need to aggregate the results:
(defn getscores [matchesresults]
(into []
(vals
(reducekv (fn [scores match result]
(mergewith + scores (zipmap match (first result))))
(sortedmap)
matchesresults))))
(def s (getscores mr))
s
We calculate the strategies for the next frame using the strategies
and scores
from the current frame and the neighbours
vector:
(defn calculatenextstrategies [currentstrategies scores neighbours]
(into []
(mapindexed (fn [i s]
(let [n (neighbours i)
sn (selectkeys scores n)
m (val (apply maxkey val sn))
w (into {} (filter #(= (val %) m) sn))]
(if (w i)
s
(currentstrategies (randnth (keys w))))))
currentstrategies)))
(def frame2 (calculatenextstrategies frame1 s n))
frame2
Given any two frames with strategies we can find out which players have changed strategies:
(defn calculatechangedstrategies [previousstrategies nextstrategies]
(into #{}
(filter #(not= (previousstrategies %) (nextstrategies %))
(range (count previousstrategies)))))
(calculatechangedstrategies frame1 frame2)
And when extracting the lastdecisions
from the matchesresults
we reset to [nil nil]
those where the players changed strategies:
(defn getlastdecisions [matchesresults changedstrategies]
(reducekv (fn [scores match result]
(let [lastdecisions (if (some changedstrategies match)
[nil nil]
(rest result))]
(assoc scores match lastdecisions)))
{}
matchesresults))
Using the building blocks above we can build a lazy sequence where each element is a frame
in the simulation:
(defn frameseq
([gridwidth noofplays strategydistribution]
(let [matches (generatematches gridwidth)
neighbours (generateneighbours matches)
strategies (>> (randomplayerstrategies strategydistribution)
(take (* gridwidth gridwidth))
(into []))
lastdecisions (zipmap matches (repeat [nil nil]))]
(lazyseq
(cons strategies (frameseq neighbours strategies lastdecisions noofplays)))))
([neighbours previousstrategies previouslastdecisions noofplays]
(let [results (matchesresults previousstrategies previouslastdecisions noofplays)
scores (getscores results)
nextstrategies (calculatenextstrategies previousstrategies scores neighbours)
changedstrategies (calculatechangedstrategies previousstrategies nextstrategies)
nextlastdecisions (getlastdecisions results changedstrategies)]
(lazyseq
(cons nextstrategies (frameseq neighbours nextstrategies nextlastdecisions noofplays))))))
Now that we have defined the sequence it’s easy to render it:
(defn renderframes [strategyframes canvasid waitmsec noofframes]
(let [canvas (js/document.getElementById canvasid)]
(async.macros/go
(doseq [frame (take noofframes strategyframes)]
(.requestAnimationFrame js/window #(renderplayerstrategies! canvas frame))
(async/<! (async/timeout waitmsec))))))
We just take noofframes
frames from the sequence and for each one call requestAnimationFrame
with a timeout of waitmsec
between each frame. Feel free to experiment with the values below. Remember press Ctrl+Enter
to rerender after changing something:
(def strategyframes
(frameseq 15 5 {:alwayscooperate 1
:alwaysdefect 1
:titfortat 2}))
(renderframes strategyframes "canvaswalkthrough" 500 10)
The interactive code snippets in this article are powered by KLIPSE.