Wednesday, July 10, 2019

Fsharp code kata: Prisoner Dilemma

This blog is about the Prisoner Dilemma in F#

(EDIT: something changed  in the code so the description given in this post may be not 100% accurate. Some comments in the source code can give more info about)

(github code: https://github.com/tonyx/PrisonerDilemmaFsharp)

A quick look at the prisoner dilemma description in wikipedia.

This code is about:
defining the possible strategies
defining the player types
creating an iterated prisoner dilemma tournament
playing a tournament.
computing the scores in a tournament.

Let's start with the description and the code:

1) Each move can be: Cooperates, or Defect
type Move = Cooperate | Defect
view raw type.fs hosted with ❤ by GitHub


2) the result of a single game between two players is this record
type JointMove = {Player1Move: Move; Player2Move: Move}
view raw JointMove.fs hosted with ❤ by GitHub

Let's also define the mutual score of a jointmove.

3) given a JointMove, we get a score determined by both the players moves:
type JointScore = {Player1Score: int; Player2Score: int}
view raw JointScore.fs hosted with ❤ by GitHub
To determine the JointScore, given a JointMove we have the following function

let jointScore jointMove =
match jointMove with
| {Player1Move= Cooperate ; Player2Move=Cooperate} -> {Player1Score=4; Player2Score=4 }
| {Player1Move= Cooperate ; Player2Move=Defect} -> {Player1Score=2; Player2Score=5}
| {Player1Move= Defect ; Player2Move=Cooperate} -> {Player1Score=5; Player2Score=2}
| {Player1Move= Defect ; Player2Move=Defect} -> {Player1Score=3; Player2Score=3}
view raw jointScore.fs hosted with ❤ by GitHub
I want to have a version of the previous function that allows me to "parametrize" the scores, and chack the  P>R>T>S rule at compile time

let jointScore jointMove =
let R = 4 // my outcome if I cooperate and the other cooperates
let S = 2 // my outcome if I cooperate and the other defects
let T = 5 // my outcome if I defect ant the other cooperates
let P = 3 // my outcome if I defect ant the other defects
let checkPDConditions = (T > R) & (R > P) & (P > S)
let _ = if ( not checkPDConditions) then failwith "this is not a pd game: violated pd conditions: T > R > P > S"
match jointMove with
| {Player1Move= Cooperate ; Player2Move=Cooperate} -> {Player1Score=R; Player2Score=R }
| {Player1Move= Cooperate ; Player2Move=Defect} -> {Player1Score=S; Player2Score=T }
| {Player1Move= Defect ; Player2Move=Cooperate} -> {Player1Score=T; Player2Score=S}
| {Player1Move= Defect ; Player2Move=Defect} -> {Player1Score=P; Player2Score=P}
view raw jointScore2.fs hosted with ❤ by GitHub



In the "iterated" version, two players play repeatedly. each player remembers the history of the jointMoves and can use it to decide the next move.

So a player adopts a Strategy, which is a function from jointMoves list to a Move:

type Strategy = JointMove list -> Move

(EDIT: strategy changed ad follows:

type Strategy = Move list -> Move list -≷ Move)

We may also want to label a strategy with a name, so we define a type of record called StrategyInfo:

type StrategyInfo = {Name: string; Strategy:Strategy}

We want to give to each player a name and a StrategyInfo (which means a strategy with a name):
type Player = {Name: string; StrategyInfo: StrategyInfo}

Any game, at a certain state, involves two players, and a list of jointMove that they made:

type Game = { Player1:Player; Player2:Player; JointmoveHistory: JointMove list }

A "tick" is the joint play of both player in a game. The result will be the same game with their joint moves added to the history of moves:
let tick (game:Game) =
let playerOneMove = game.Player1.StrategyInfo.Strategy game.JointmoveHistory
let playerTwoMove = game.Player2.StrategyInfo.Strategy game.JointmoveHistory
{game with JointmoveHistory = {Player1Move=playerOneMove;Player2Move=playerTwoMove}::game.JointmoveHistory }
view raw tick.fs hosted with ❤ by GitHub




A function for "ticking" a game n time will be useful:

let nTicks (game: Game) n =
[1 .. n ] |> List.fold (fun game _ -> game |> tick ) game
view raw tick.fs hosted with ❤ by GitHub

Now, let's try some simple cases, not before defining the naive strategy, the cooperator:

let (cooperatorStrategy:Strategy) =
     fun _ ->
          Cooperate

We want two create a cooperatorstrategyInfo, and two cooperators players, which use the same strategyInfo:


let cooperatorStrategyInfo = {Name="cooperatorStrategy";Strategy=cooperatorStrategy}
let cooperator1 = {Name = "cooperator1";StrategyInfo= cooperatorStrategyInfo}
let cooperator2 = {Name = "cooperator2";StrategyInfo= cooperatorStrategyInfo}

let's define a new game:
let aGame = {Player1=cooperator1;Player2=cooperator2;JointmoveHistory=[]}

let's tick it one time:

game |> tick

the result is a game with a one item in the history of moves.
let't "3 ticks" the game, using the nTicks function:

nTick game 3

the resulting game has a history of moves longest 3.

We can't do so much with a single game.
We need a way to define a tournament, which is based on a list of games, and a number representing the "ticks for game"


(edit July 11 - 2019: in the new version I dropped the tornment type, and always use the games, by playing a game for n "ticks" using the function playGameNTimes)

type Tournment = {Games:Game list; TicksForGame: int}

type Tournment = {Games:Game list; TicksForGame: int}
let makeTournment players ticksForGame =
let games = (players |> List.map (fun x -> ((players |> List.filter (fun z -> z.Name <> x.Name)) |>
(List.map (fun y -> {Player1=x;Player2=y;JointmoveHistory=[]} ))))) |> List.fold (@) []
{Games = games;TicksForGame=ticksForGame}
view raw tournment.fs hosted with ❤ by GitHub


What the previous code actually means is, described in a procedural way:
for each player, get any player which is different from the current player, and create a game of them. The result will be "flattened" by the List(@) [] operator.
The resulting operation will be the game, and the function will return a structure given by the games and the ticks for each game.
This will ensure that for each pair of players there will be a game with it as the first and second player (and vice-versa).

To play a tournament, we just have to tick each game:
let playTournment (tournment:Tournment) =
tournment.Games |> List.map (fun x -> nTicks x tournment.TicksForGame)
Given a game, we may want to get the total score of the game (given by cumulating all the scores):

let gameScores game =
game.JointmoveHistory |> List.fold (fun acc x ->
let jo = jointScore x;
{Player1Score=jo.Player1Score+acc.Player1Score;
Player2Score=jo.Player2Score+acc.Player2Score
})
{Player1Score=0;Player2Score=0}
let gamesScores games =
games |> List.map (fun (x:Game) -> (x.Player1,x.Player2,gameScores x))
view raw gamescore.fs hosted with ❤ by GitHub

For a list of games, we want to get all the players and score for game, and put them in triplets (player1, player2, jointScore):





We want to compute the score in a set of games for a single player.
let scoreForPlayer games player =
let ot= gamesScores games
let gamesWherePlayerIsFirst = ot |> List.filter (fun (player1,_,_) -> player1.Name = player.Name)
let gamesWherePlayerIsSecond = ot |> List.filter (fun (_,player2,_) -> player2.Name = player.Name)
let scoresOfPlayerAsFirst = gamesWherePlayerIsFirst |> List.sumBy (fun (_,_,x) -> x.Player1Score)
let scoresOfPlayerAsSecond = gamesWherePlayerIsSecond |> List.sumBy (fun (_,_,x) -> x.Player2Score)
scoresOfPlayerAsFirst+scoresOfPlayerAsSecond

And we want also the scores for all the players.
let playersOfGames (games:Game list) =
games |> List.map (fun x -> [x.Player1;x.Player2]) |> List.fold (@) [] |> Set.ofList |> Set.toList
let allPlayersScore games =
let players = playersOfGames games
players |> List.map (fun x -> (x,scoreForPlayer games x))
(it looks redundant having a way to re-extract players that we already used from the beginning, but for now I found easier to get back players from the games, also because in future "evolutionary" versions of the game players may mutate).


Now suppose that we will start a tournment between four players:
a cooperator, a defector, a random, and a titForTat
I define a strategyInfo for each of them:

    let cooperatorStrategyInfo = {Name="cooperatorStrategy";Strategy=cooperatorStrategy}
    let defectorStrategyInfo = {Name="defectorStrategy"; Strategy=defectorStrategy}
    let randomStrategyInfo = {Name="randomStrategy";Strategy=randomStrategy}
    let titForTatStrategyInfo = {Name="titForTatStrategy";Strategy=titForTatStrategy}

To define a single player, like a cooperator, we may  say:
    let cooperatorPlayer = {Name="cooperator";StrategyInfo=cooperatorStrategyInfo}

But we rather want to have a chunk of cooperators, sharing the same strategy, so we have the following function:



let makeNPlayersByStrategyInfo (strategyInfo:StrategyInfo) n =
[0 .. n-1] |> List.map (fun x -> {Name=(strategyInfo.Name + "_"+(string)x);StrategyInfo=strategyInfo})


Now suppose we want to make a tornment between 5 cooperators, 5 titForTat, 5 defectors, and 5 randoms:

let cooperatorPlayers = makeNPlayersByStrategyInfo cooperatorStrategyInfo 5
let titForTatPlayers = makeNPlayersByStrategyInfo titForTatStrategyInfo 5
let randomPlayers = makeNPlayersByStrategyInfo randomStrategyInfo 5
let defectorPlayers = makeNPlayersByStrategyInfo defectorStrategyInfo 5

We put all players in a single list:

let allPlayers = cooperatorPlayers@titForTatPlayers@randomPlayers@defectorPlayers

we create a tournment, making all players play against each other, with 10 ticks for game:

let tournment = makeTournment allPlayers 10

we run the tournment:

let playedGames = playTournment tournment

Now we can see all the scores, sorted from the lowest:

sortedPlayersScore playedGames;;

this is the sketch of the result, showing that in this kind of tournament the defector is generally stronger.
This is just a particular case, where the defectors win as long as there are many cooperators to spoil. When there are many defectors (or free riders) they will perform lower because a game between defectors will end up in a lower total outcome, compared to, say cooperator vs cooperator (or titForTat vs titForTat).
Analyzing the prisoner dilemma theory is beyond the goal of this post.

val it : (Player * int) list =
  [({Name = "cooperatorStrategy_4";
     StrategyInfo = {Name = "cooperatorStrategy";
                     Strategy = ;};}, 1208);
   ({Name = "cooperatorStrategy_2";
     StrategyInfo = {Name = "cooperatorStrategy";
                     Strategy = ;};}, 1210);
   ({Name = "cooperatorStrategy_0";
     StrategyInfo = {Name = "cooperatorStrategy";
                     Strategy = ;};}, 1216);
...
                     Strategy = ;};}, 1536);
   ({Name = "defectorStrategy_3";
     StrategyInfo = {Name = "defectorStrategy";
                     Strategy = ;};}, 1540);
   ({Name = "defectorStrategy_4";
     StrategyInfo = {Name = "defectorStrategy";
                     Strategy = ;};}, 1554);
   ({Name = "defectorStrategy_0";
     StrategyInfo = {Name = "defectorStrategy";
                     Strategy = ;};}, 1566);
   ({Name = "defectorStrategy_2";
     StrategyInfo = {Name = "defectorStrategy";
                     Strategy = ;};}, 1568)]


Future development:
- implementing evolution
- implementing real time chart visualization


No comments: