Ethics Decision Theory


Some Prisoner's Dilemma (PD) Terminology

Reward - The gain made by an Agent from mutual Cooperation.

Punishment - The loss suffered by an Agent for mutual Defection.

Temptation - The greater gain to be accrued for Defecting when the Other Cooperates.

Sucker's Payoff - The greater loss to be suffered for Cooperating when the Other Defects.

In mathematically defining these four variables, the letters P (Punishment), R (Reward), S (Sucker's Payoff), and T (Temptation) are often used.
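The glossary does not state how the four variables relate to one another.  As a sketch of the usual convention, treating each variable as a signed payoff (so the two losses are simply the smaller values), a game counts as a PD when the first inequality below holds.  The second inequality is the condition commonly added for the IPD, so that taking turns exploiting each other does not pay better than steady mutual Cooperation.

```latex
T > R > P > S
\qquad\text{and}\qquad
2R > T + S
```

The values T=5, R=3, P=1, S=0, used in Axelrod's tournaments, satisfy both conditions.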

Defect Unconditionally (DU) - The strategy of always Defecting in any encounter.  This is also known as the Iron Rule.

Cooperate Unconditionally (CU) - The strategy of always Cooperating in any encounter.  This is also known as the Golden Rule.

Iterated Prisoner's Dilemma (IPD) - Instead of just one encounter, the two Agents have multiple, sequential encounters.

Finite IPD - The two Agents have a finite number of known encounters with each other.

Infinite IPD - The two Agents will have encounters forever.

Indefinite IPD - The two Agents will have a finite number of encounters, but neither knows how many encounters this will be.

Shadow of the Future - A metaphor for describing the effect of IPD.  In making a decision at the current encounter, the Agent must consider its effect on all future encounters with Other.  A long Shadow of the Future means that there will be many more encounters, so future encounters greatly influence the current decision.  A short Shadow means few or no future encounters, so that only the immediate consequences are considered.  The Shadow of the Future is generally a motivation to Cooperate.

Zero-Sum - A game where the gain of one Agent is the equivalent loss of the other Agent.  The total gain of both Agents is always zero.

Retaliation - In the previous encounter, Agent A Cooperated while Agent B Defected.  In the next encounter, Agent A decides to Retaliate against Agent B by Defecting.

Forgiving - In the previous encounter, Agent A Cooperated with Agent B.  Agent B will choose to Cooperate with Agent A in the next encounter because of this, even though Agent A may have made previous Defections.

Nice - A strategy is Nice if it is never first to Defect.

Asynchronous - In traditional PD, the two Agents must make their choices simultaneously.  In many real-world situations, however, the choices are made sequentially.  Agent A will make the first choice.  Agent B will make his/her decision based on the knowledge of Agent A's choice.

Anticipation - In making a decision within an Asynchronous PD, the Agent tries to anticipate the subsequent choice of Other.  Anticipation is usually a motivation toward Defection in this circumstance.

Farmer's Dilemma - An asynchronous form of PD, taken from an example by philosopher David Hume.  Two farmers are ready to harvest their respective crops, but neither can adequately harvest his own crop alone.  If both farmers Cooperate, Farmer B can help Farmer A harvest his (A's) crop first, then Farmer A will immediately reciprocate by helping Farmer B harvest B's crop.  But because of their mutual suspicion of Defection regarding the second harvest, neither farmer assists the other, and both suffer loss to their crops.

Traveler's Dilemma (or the Rover) - In a closed population of Agents, an Agent who always Defects (DU) may eventually be punished by the others.  A solution for the DU Agent is to travel continually to other populations, taking advantage of their Cooperation until they begin punishing him/her.  Conversely, each population must choose whether to Cooperate or Defect with a newcomer whose history it does not know.

Dollar Auction - A dollar is auctioned to a group of Agents.  The rules are 1) the highest bidder gets the dollar, but 2) the second-highest bidder must also pay his/her bid, receiving nothing.  This seemingly benign game turns ugly once the bidding reaches 100 cents.  In an effort to avoid paying his/her bid, the second-highest bidder now makes a bid greater than 100 cents, more than the dollar is actually worth.  And from that moment, the bidding spirals to unreasonable heights, as Agents are motivated to minimize their loss, rather than maximize their gain.
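The escalation described above can be checked with a little arithmetic.  The sketch below is only an illustration: the function names, the fixed 5-cent bid increment, and the "myopic" rule (compare quitting now with raising once and winning) are assumptions, not part of the original definition.

```python
PRIZE = 100  # the dollar, in cents

def loss_if_quit(own_bid):
    # The second-highest bidder pays his/her bid and receives nothing.
    return -own_bid

def net_if_raise_and_win(top_bid, increment):
    # Outbid the current leader by one increment and win the dollar.
    return PRIZE - (top_bid + increment)

def myopic_raise(own_bid, top_bid, increment=5):
    """True if raising once and winning beats quitting right now."""
    return net_if_raise_and_win(top_bid, increment) > loss_if_quit(own_bid)

# Second-highest bid is 95 cents, highest is 100 cents.
# Quitting loses 95; bidding 105 and winning loses only 5.
print(myopic_raise(95, 100))
# The same comparison still favors raising far above the dollar's value.
print(myopic_raise(995, 1000))
```

Under these assumptions the comparison never flips as the bids grow, which is why each Agent keeps minimizing loss instead of walking away.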

Tragedy of the Commons - A PD, except that there are many Agents.  The rational decision is for each Agent to Defect, leaving all the poorer.  Again, this is based on a farming parable.  Several farmers each have a herd of cattle.  Their herds graze on common land, open to all.  The common land can only support a finite number of cattle.  If each farmer limits the size of his herd (Cooperates), then there will be sufficient land for all farmers.  But the temptation is to Defect.  By increasing his herd, a farmer will increase his profit when he sells the cattle.  But he does so at the expense of the other farmers, whose cattle will be malnourished, or die of starvation.  Rationally, each farmer will try to increase his herd beyond the capacity of the common land.  Yet, if all Cooperated, there would be sufficient land for a limited number of cattle.  Instead, all cattle starve for lack of food.

Tit-For-Tat (TFT) - One of the more successful strategies in IPD.  It consists of 1) Cooperate in the first round, 2) in each subsequent round, duplicate the Other Agent's Cooperate/Defect decision from the previous round.  This strategy rewards Cooperation, and punishes Defection.  TFT is the standard against which other strategies are compared.
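A minimal sketch of TFT in a Finite IPD follows.  The payoff values (T=5, R=3, P=1, S=0), the function names, and the convention that a strategy receives both move histories are assumptions made for illustration.

```python
# Assumed payoff values; any values with T > R > P > S would do.
T, R, P, S = 5, 3, 1, 0
PAYOFF = {
    ("C", "C"): (R, R),
    ("C", "D"): (S, T),
    ("D", "C"): (T, S),
    ("D", "D"): (P, P),
}

def tit_for_tat(own_history, other_history):
    # Cooperate in the first round, then copy Other's previous move.
    return "C" if not other_history else other_history[-1]

def defect_unconditionally(own_history, other_history):
    return "D"

def cooperate_unconditionally(own_history, other_history):
    return "C"

def play(strategy_a, strategy_b, rounds):
    """Play a Finite IPD and return the total score of each Agent."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        gain_a, gain_b = PAYOFF[(move_a, move_b)]
        score_a += gain_a
        score_b += gain_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play(tit_for_tat, defect_unconditionally, 10))     # (9, 14)
print(play(tit_for_tat, cooperate_unconditionally, 10))  # (30, 30)
```

Against DU, TFT is exploited only in the first round and Defects thereafter.  Against CU or another TFT, it Cooperates throughout.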

Imperfect TFT (ITFT) - A small degree of random error (1% - 5%) is introduced into the TFT strategy.  1% to 5% of the time, the Agent will make the choice opposite of that dictated by the deterministic strategy of TFT.

Generous TFT (GTFT) - TFT, but more likely to forgive Defections.

Suspicious TFT (STFT) - Like TFT, except the first move is to Defect.

Tit-For-Two-Tats (TF2T) - A form of GTFT.  The Agent will not retaliate against the Other until it has experienced 2 sequential Defections.

GRIM (sometimes called TRIGGER) - A strategy of Cooperating until the Other Agent Defects.  After that moment, the strategy will be DU at every subsequent encounter.
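The TFT variants and GRIM above can each be sketched as a small function of the two move histories.  The function names, the history convention (lists of "C"/"D", oldest move first), and the default forgiveness probability of 0.1 for GTFT are assumptions; the glossary gives no figure for how generous GTFT should be.

```python
import random

def tit_for_tat(own, other):
    return "C" if not other else other[-1]

def suspicious_tft(own, other):
    # Like TFT, except the first move is to Defect.
    return "D" if not other else other[-1]

def tit_for_two_tats(own, other):
    # Retaliate only after 2 sequential Defections by Other.
    return "D" if other[-2:] == ["D", "D"] else "C"

def grim(own, other):
    # Cooperate until Other Defects once, then Defect forever.
    return "D" if "D" in other else "C"

def generous_tft(own, other, forgiveness=0.1, rng=random):
    # TFT, but forgive a Defection with probability `forgiveness` (assumed value).
    move = tit_for_tat(own, other)
    if move == "D" and rng.random() < forgiveness:
        return "C"
    return move

def imperfect(strategy, error_rate=0.05, rng=random):
    # Wrap a strategy so it makes the opposite choice `error_rate` of the time.
    def noisy(own, other):
        move = strategy(own, other)
        if rng.random() < error_rate:
            return "D" if move == "C" else "C"
        return move
    return noisy

imperfect_tft = imperfect(tit_for_tat, error_rate=0.05)  # ITFT with 5% error
```

Writing ITFT as a wrapper is a design choice of this sketch: the same random error can then be applied to any of the other strategies.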

Reciprocal Cooperation - A strategy of: If the Other Agent will Cooperate if you Cooperate, and will Defect if you do not Cooperate, then Cooperate.  Otherwise, Defect.  This is a version of Anticipation in Asynchronous PD.

Reciprocal Altruism - Altruism limited to exchanges between just two Agents.  It is relatively easy to demonstrate how 2 Agents could show each other mutual Cooperation or Altruism.  It is much harder to demonstrate how general Altruism could come into being.

Pavlovian - Named after the discoverer of classical conditioning, this label is used in several, somewhat contradictory, contexts.  Originally, it meant an entire class of strategies, each strategy being a "reflexive" decision made on the basis of the previous encounter.  Where C=Cooperation and D=Defection, each encounter has four possible outcomes (CC, CD, DC, DD).  A strategy chooses to Cooperate or Defect after each of these 4 possibilities.  A strategy of (1,1,1,1) would be equivalent to CU.  A strategy of (0,0,0,0) is equivalent to DU.  A strategy of (1,0,1,0) is equivalent to TFT.  Thus, after 1 encounter, there is a class of 16 possible strategies, which can be defined as a vector with 4 terms.  This is a class of level 1 Pavlovian strategies.

But a decision strategy may be based on the outcomes of the previous two encounters, not just one.  This would be the class of level 2 Pavlovian strategies.  Pavlovian decision strategies can be based on 3, 4, or more levels.  Naturally, this gets quite complicated.

In another context, the label Pavlovian may be used to describe a particular level 1 strategy, a strategy that was found to be oddly successful in IPD.
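The level 1 class can be sketched as a lookup table over the four outcomes.  Two assumptions are made here: each outcome lists the Agent's own choice first (the reading under which (1,0,1,0) behaves as TFT), and the opening move is supplied separately, since the 4-term vector does not specify it.  The function names are hypothetical.

```python
from itertools import product

OUTCOMES = ("CC", "CD", "DC", "DD")  # own choice first, then Other's

def make_level1(vector, first_move="C"):
    """Build a level 1 strategy from a 4-term vector (1=Cooperate, 0=Defect)."""
    table = dict(zip(OUTCOMES, vector))
    def strategy(own, other):
        if not own:
            return first_move
        return "C" if table[own[-1] + other[-1]] else "D"
    return strategy

# Every possible level 1 vector: 2 choices for each of 4 outcomes.
ALL_LEVEL1 = list(product((0, 1), repeat=4))
print(len(ALL_LEVEL1))  # 16

cu = make_level1((1, 1, 1, 1))
du = make_level1((0, 0, 0, 0), first_move="D")
tft = make_level1((1, 0, 1, 0))
win_stay_lose_shift = make_level1((1, 0, 0, 1))
```

The vector (1,0,0,1), often called win-stay, lose-shift, is the strategy most commonly labeled "Pavlov" in the literature.  The glossary does not say which vector it has in mind, so that identification is an assumption.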

Free Rider - A large population of Agents may have a collective strategy of CU.  An agent among them, the Free Rider, chooses to Defect to take advantage of their Cooperation.  The Free Rider enjoys free gain from the burden of Others.

Scorpion - An Agent who makes a drastic Defection, one so extreme that it exceeds the normal boundaries of the PD.  The defection is so great that it brings great loss to the Defector himself, or ends the possibility of all future encounters (killing the Other).  The label is based on a fable.  A frog and a scorpion wish to cross a stream, where a stork lurks at the other side.  The frog is afraid of being eaten by the stork, and the scorpion cannot swim across the stream.  The scorpion asks the frog to carry him across the stream.  The scorpion can frighten the stork, preventing it from eating the frog.  The frog asks the scorpion how he can be trusted not to sting him while they cross the stream.  The scorpion replies that he himself would die if he stung the frog while crossing.  So the frog agrees to ferry the scorpion to the other side.  Yet, despite his promise, against his own self-interest, acting out of unreasonable instinct, the scorpion stings the frog, killing both.

Weak Altruism - Altruistic action taken by Agent A (to provide gain to Agent B) where Agent A suffers no loss.

Strong Altruism - Altruistic action taken by Agent A, on behalf of Agent B, where Agent A suffers actual loss.

Discount Factor - In an IPD, the degree to which the possible gain/loss of future encounters is taken less seriously (discounted) than the gain/loss of the immediate encounter.
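As a worked illustration, suppose (these are assumptions, not part of the definition above) that the Discount Factor is a single number δ between 0 and 1 applied once per encounter, that the IPD is Infinite, and that the Other plays GRIM.  The first line is the discounted value of a stream of gains u_t.  The second compares Cooperating forever with Defecting once and then facing mutual Defection forever.

```latex
V = \sum_{t=0}^{\infty} \delta^{t} u_t , \qquad 0 \le \delta < 1

\frac{R}{1-\delta} \;\ge\; T + \frac{\delta P}{1-\delta}
\quad\Longleftrightarrow\quad
\delta \;\ge\; \frac{T-R}{T-P}
```

With T=5, R=3, P=1 the threshold is δ ≥ 1/2: an Agent who weights each future encounter at least half as much as the current one does better by Cooperating.  This is one way of making the Shadow of the Future precise.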

Evolution - The population of Agents cannot be forever static.  Some strategies will be more successful than others.  In the real world, this means that the proportion of successful strategies will increase, while those of the unsuccessful will decrease.  As a strategy's opponents/partners change, it will change in its success.
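One common way to model this is a discrete replicator update, in which each strategy's share of the population grows in proportion to its average score against the current mix.  The sketch below is an assumption-laden illustration rather than a method taken from this glossary: the update rule, the three strategies chosen, and the payoff matrix (10-round totals computed with T=5, R=3, P=1, S=0) are all supplied for the example.

```python
NAMES = ["TFT", "DU", "CU"]

# PAYOFF[i][j]: total score of strategy i against strategy j over 10 rounds,
# assuming T=5, R=3, P=1, S=0 (illustrative values).
PAYOFF = [
    [30, 9, 30],   # TFT vs TFT, DU, CU
    [14, 10, 50],  # DU  vs TFT, DU, CU
    [30, 0, 30],   # CU  vs TFT, DU, CU
]

def replicator_step(shares, payoff):
    """Return next-generation shares under a discrete replicator update."""
    n = len(shares)
    # Average score of each strategy against the current population mix.
    fitness = [sum(payoff[i][j] * shares[j] for j in range(n)) for i in range(n)]
    mean_fitness = sum(f * s for f, s in zip(fitness, shares))
    # Shares grow or shrink with fitness relative to the population mean.
    return [s * f / mean_fitness for s, f in zip(shares, fitness)]

shares = [1 / 3, 1 / 3, 1 / 3]
for generation in range(5):
    shares = replicator_step(shares, PAYOFF)
    print(generation + 1, dict(zip(NAMES, (round(s, 3) for s in shares))))
```

Starting from equal shares, DU gains at first because it exploits CU.  As the share of CU shrinks, DU's advantage shrinks with it, which illustrates how a strategy's success changes as its opponents/partners change.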

Axelrod's 4 Maxims - Based on his original research, Robert Axelrod proposed 4 criteria on which viable decision strategies should be based.  By following these criteria, the strategy should be effective in IPD.  1) Be Nice, do not be the first to Defect.  2) Defect against previous Defection (Retaliate), but Cooperate (Forgive) when Other begins to Cooperate.  3) Eschew envy (do not Defect if Other has accumulated more than you).  4) Make certain Other knows how you will respond.