Visualizing the Prisoner's Dilemma
Jul 9, 2017 · 8 minute read clojureinteractiveThe grid you see below is a simulation running in your browser of a game called The Prisoner’s Dilemma. Each cell represents a player, and the colour of the cell indicates the strategy the player is using. Notice how the strategies that perform best gain more territory.
In this article we will walk through the code that runs the simulation, but first let’s explain the rules of the game:
- 
each player makes a decision to either cooperateordefect
- 
the players cannot communicate with each other while making their decision 
- 
the points the players get after deciding are determined by the following table: 
| player 2 | |||
| cooperate | defect | ||
| player 1 | cooperate | (3,3) | (0,5) | 
| defect | (5,0) | (1,1) | |
For example, if player 1 defects and player 2 cooperates, player 1 would get 5 points and player 2 would get 0 points. When the game is played more than once it is known as Iterated Prisoner’s dilemma and it’s what we are simulating in this article.
Now that the game is clear let’s take a look at the implementation. We will have 3 simple strategies:
- 
always-cooperate- keep cooperating no matter what the opponent does
- 
always-defect- keep defecting no matter what the opponent does
- 
tit-for-tat- start by cooperating and subsequently do whatever the opponent did in the previous round
We can capture these strategies using a multimethod:
(defmulti play :strategy)
(defmethod play :always-cooperate [_]
  :cooperate)
(defmethod play :always-defect [_]
  :defect)
(defmethod play :tit-for-tat [{:keys [opponent-last-decision]}]
  (if (nil? opponent-last-decision)
    :cooperate
    opponent-last-decision))
(play {:strategy               :tit-for-tat
       :opponent-last-decision :defect})
(Note: You can edit the code blocks directly in the browser. Just press Ctrl+Enter to trigger reevaluation.)
Given any 2 strategies we can generates a sequence of results containing a score and the two decisions made:
(defn match-results
  ([s1 s2] (match-results s1 s2 nil nil))
  ([s1 s2 lm1 lm2]
    (let [m1 (play {:strategy s1 :opponent-last-decision lm2})
          m2 (play {:strategy s2 :opponent-last-decision lm1})
          score (case [m1 m2]
                  [:cooperate :cooperate] [3 3]
                  [:defect    :cooperate] [5 0]
                  [:cooperate :defect   ] [0 5]
                  [:defect    :defect   ] [1 1])]
      (lazy-seq
         (cons [score m1 m2] (match-results s1 s2 m1 m2))))))
So if a player using tit-for-tat and another using always-defect play 4 rounds:
(take 4 (match-results :tit-for-tat :always-defect))
For the first frame we generate random strategies:
(take 9
  (repeatedly
    #(rand-nth [:always-cooperate :always-defect :tit-for-tat])))
This works, but let’s write a version that takes a strategy-distribution so that we can try out different densities of a particular strategy:
(def strategy-distribution {:always-cooperate 2
                            :always-defect    1
                            :tit-for-tat      5})
(defn random-player-strategies [strategy-distribution]
  (let [strategies (reduce-kv #(into %1 (repeat %3 %2))
                              []
                              strategy-distribution)]
    (repeatedly #(rand-nth strategies))))
(take 9 (random-player-strategies strategy-distribution))
Now that we can generate strategies we can render them to a canvas:
(defn render-player-strategies! [canvas strategies]
  (let [ctx (.getContext canvas "2d")
        width (.-width canvas)
        height (.-height canvas)
        grid-width (js/Math.sqrt (count strategies))
        cell-width (/ width grid-width)
        cell-height (/ height grid-width)]
    (.clearRect ctx 0 0 width height)
    (doseq [[i strategy] (map-indexed vector strategies)]
      (set! (.-fillStyle ctx) (case strategy
                                :always-cooperate "green"
                                :always-defect    "red"
                                :tit-for-tat      "orange"))
      (.fillRect ctx
                 (* (mod i grid-width) cell-width)
                 (* (quot i grid-width) cell-height)
                 (dec cell-width)
                 (dec cell-height)))))
Let’s try it out:
(def grid-width 15)
(render-player-strategies!
  (js/document.getElementById "canvas-render")
  (take (* grid-width grid-width)
        (random-player-strategies {:always-cooperate 4
                                   :always-defect    4
                                   :tit-for-tat      2})))
As you can see in the diagram below, each player will play against his neighbours. In other words, there is a match for each one of the red lines between the cells:
Given a certain grid-width we can generate a list of these matches:
(defn compare-grid [g l r]
 (let [c (compare (mod l g) (mod r g))]
   (if (zero? c)
     (compare l r)
     c)))
(defn generate-matches [grid-width]
  (let [no-of-players (* grid-width grid-width)
        players-horizontal (range no-of-players)
        players-vertical (sort (partial compare-grid grid-width)
                               (range no-of-players))
        players (concat players-horizontal players-vertical)]
    (mapcat #(partition 2 1 %)
            (partition grid-width players))))
(def m (generate-matches 3))
m
Given the matches we can build a vector containing the neighbours of a particular player:
(defn generate-neighbours [matches]
  (reduce (fn [m d]
               (let [k (first d)
                     v (get m k [(first d)])]
                 (assoc m k (conj v (second d)))))
             []
             (mapcat #(vector % (reverse %)) matches)))
(def n (generate-neighbours m))
n
So for example the neighbours for player 4 would be:
(n 4)
Ok now given the strategies and last-decisions from the previous matches, we can calculate the next set of results:
(defn matches-results [player-strategies match-last-decisions no-of-matches-per-frame]
  (reduce
    (fn [results match]
      (let [[strategy-p1 strategy-p2] (vals (select-keys player-strategies match))
            [last-decision-p1 last-decision-p2] (match-last-decisions match)
            result (reduce (fn [[s1 _ _][s2 m1 m2]]
                             (vector (map + s1 s2) m1 m2))
                           (take no-of-matches-per-frame
                             (match-results strategy-p1 strategy-p2 last-decision-p1 last-decision-p2)))]
        (assoc results match result)))
    {}
    (keys match-last-decisions)))
(def frame-1
  (into []
        (take 9 (random-player-strategies {:always-cooperate 1
                                           :always-defect    1
                                           :tit-for-tat      2}))))
(def mr
  (matches-results frame-1
                   (zipmap m (repeat [nil nil]))
                   5))
mr
To get the final scores we will need to aggregate the results:
(defn get-scores [matches-results]
  (into []
    (vals
      (reduce-kv (fn [scores match result]
                    (merge-with + scores (zipmap match (first result))))
                 (sorted-map)
                 matches-results))))
(def s (get-scores mr))
s
We calculate the strategies for the next frame using the strategies and scores from the current frame and the neighbours vector:
(defn calculate-next-strategies [current-strategies scores neighbours]
  (into []
        (map-indexed (fn [i s]
	                     (let [n (neighbours i)
                             sn (select-keys scores n)
                             m (val (apply max-key val sn))
                             w (into {} (filter #(= (val %) m) sn))]
                         (if (w i)
                           s
                           (current-strategies (rand-nth (keys w))))))
             		 current-strategies)))
(def frame-2 (calculate-next-strategies frame-1 s n))
frame-2
Given any two frames with strategies we can find out which players have changed strategies:
(defn calculate-changed-strategies [previous-strategies next-strategies]
  (into #{}
    (filter #(not= (previous-strategies %) (next-strategies %))
            (range (count previous-strategies)))))
(calculate-changed-strategies frame-1 frame-2)
And when extracting the last-decisions from the matches-results we reset to [nil nil] those where the players changed strategies:
(defn get-last-decisions [matches-results changed-strategies]
  (reduce-kv (fn [scores match result]
               (let [last-decisions (if (some changed-strategies match)
                                      [nil nil]
                                      (rest result))]
                 (assoc scores match last-decisions)))
              {}
              matches-results))
Using the building blocks above we can build a lazy sequence where each element is a frame in the simulation:
(defn frame-seq
  ([grid-width no-of-plays strategy-distribution]
    (let [matches        (generate-matches grid-width)
          neighbours     (generate-neighbours matches)
          strategies     (->> (random-player-strategies strategy-distribution)
                              (take (* grid-width grid-width))
                              (into []))
          last-decisions (zipmap matches (repeat [nil nil]))]
      (lazy-seq
        (cons strategies (frame-seq neighbours strategies last-decisions no-of-plays)))))
  ([neighbours previous-strategies previous-last-decisions no-of-plays]
    (let [results             (matches-results previous-strategies previous-last-decisions no-of-plays)
          scores              (get-scores results)
          next-strategies     (calculate-next-strategies previous-strategies scores neighbours)
          changed-strategies  (calculate-changed-strategies previous-strategies next-strategies)
          next-last-decisions (get-last-decisions results changed-strategies)]
      (lazy-seq
        (cons next-strategies (frame-seq neighbours next-strategies next-last-decisions no-of-plays))))))
Now that we have defined the sequence it’s easy to render it:
(defn render-frames [strategy-frames canvas-id wait-msec no-of-frames]
  (let [canvas (js/document.getElementById canvas-id)]
    (async.macros/go
      (doseq [frame (take no-of-frames strategy-frames)]
        (.requestAnimationFrame js/window #(render-player-strategies! canvas frame))
        (async/<! (async/timeout wait-msec))))))
We just take no-of-frames frames from the sequence and for each one call requestAnimationFrame with a timeout of wait-msec between each frame. Feel free to experiment with the values below. Remember press Ctrl+Enter to rerender after changing something:
(def strategy-frames
  (frame-seq 15 5 {:always-cooperate 1
                   :always-defect    1
                   :tit-for-tat      2}))
(render-frames strategy-frames "canvas-walkthrough" 500 10)
The interactive code snippets in this article are powered by KLIPSE.