EFTA01199747.pdf
Cooperation and control in multiplayer social dilemmas

Christian Hilbe^{a,1}, Bin Wu^{b}, Arne Traulsen^{b}, and Martin A. Nowak^{a,c}

^{a}Program for Evolutionary Dynamics, Harvard University, Cambridge, MA 02138; ^{b}Department for Evolutionary Theory, Max Planck Institute for Evolutionary Biology, 24306 Plön, Germany; and ^{c}Department of Organismic and Evolutionary Biology and Department of Mathematics, Harvard University, Cambridge, MA 02138

Edited by Joshua B. Plotkin, University of Pennsylvania, Philadelphia, PA, and accepted by the Editorial Board September 26, 2010 (received for review April 30, 2010)

Direct reciprocity and conditional cooperation are important mechanisms to prevent free riding in social dilemmas. However, in large groups, these mechanisms may become ineffective because they require single individuals to have a substantial influence on their peers. However, the recent discovery of zero-determinant strategies in the iterated prisoner's dilemma suggests that we may have underestimated the degree of control that a single player can exert. Here, we develop a theory for zero-determinant strategies for multiplayer social dilemmas, with any number of involved players. We distinguish several particularly interesting subclasses of strategies: fair strategies ensure that the own payoff matches the average payoff of the group; extortionate strategies allow a player to perform above average; and generous strategies let a player perform below average. We use this theory to describe strategies that sustain cooperation, including generalized variants of Tit-for-Tat and Win-Stay Lose-Shift. Moreover, we explore two models that show how individuals can further enhance their strategic options by coordinating their play with others. Our results highlight the importance of individual control and coordination to succeed in large groups.

evolutionary game theory | alliances | public goods game | volunteer's dilemma | cooperation

Significance

Many of the world's most pressing problems, like the prevention of climate change, have the form of a large-scale social dilemma with numerous involved players. Previous results in evolutionary game theory suggest that multiplayer dilemmas make it particularly difficult to achieve mutual cooperation because of the lack of individual control in large groups. Herein, we extend the theory of zero-determinant strategies to multiplayer games to describe which strategies maintain cooperation. Moreover, we propose two simple models of alliances in multiplayer dilemmas. The effect of these alliances is determined by their size, the strategy of the allies, and the properties of the social dilemma. When a single individual's strategic options are limited, forming an alliance can result in a drastic leverage.

Author contributions: B.W. initiated the project; C.H., B.W., A.T., and M.A.N. designed research; C.H., B.W., A.T., and M.A.N. performed research; A.T. and M.A.N. analyzed data; and C.H. and B.W. wrote the paper.
The authors declare no conflict of interest.
This article is a PNAS Direct Submission. J.B.P. is a guest editor invited by the Editorial Board.
Freely available online through the PNAS open access option.
^{1}To whom correspondence should be addressed. Email: hilbe@fas.harvard.edu.
This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1407887111/-/DCSupplemental.
www.pnas.org/cgi/doi/10.1073/pnas.1407887111 | PNAS Early Edition

Cooperation among self-interested individuals is generally difficult to achieve (1-3), but typically the free rider problem is aggravated even further when groups become large (4-9). In small communities, cooperation can often be stabilized by forms of direct and indirect reciprocity (10-17). For large groups, however, it has been suggested that these mechanisms may turn out to be ineffective, as it becomes more difficult to keep track of the reputation of others and because the individual influence on others diminishes (4-8). To prevent the tragedy of the commons and to compensate for the lack of individual control, many successful communities have thus established central institutions that enforce mutual cooperation (18-22).

However, a recent discovery suggests that we may have underestimated the amount of control that single players can exert in repeated games. For the repeated prisoner's dilemma, Press and Dyson (23) have shown the existence of zero-determinant strategies (or ZD strategies), which allow a player to unilaterally enforce a linear relationship between the own payoff and the coplayer's payoff, irrespective of the coplayer's actual strategy. The class of zero-determinant strategies is surprisingly rich: for example, a player who wants to ensure that the own payoff will always match the coplayer's payoff can do so by applying a fair ZD strategy, like Tit-for-Tat. On the other hand, a player who wants to outperform the respective opponent can do so by slightly tweaking the Tit-for-Tat strategy to the own advantage, thereby giving rise to extortionate ZD strategies. The discovery of such strategies has prompted several theoretical studies, exploring how different ZD strategies evolve under various evolutionary conditions (24-30).

ZD strategies are not confined to the repeated prisoner's dilemma. Recently published studies have shown that ZD strategies also exist in other repeated two player games (29) or in repeated public goods games (31). Herein, we will show that such strategies exist for all symmetric social dilemmas, with an arbitrary number of participants. We use this theory to describe which ZD strategies can be used to enforce fair outcomes or to prevent free riders from taking over. Our results, however, are not restricted to the space of ZD strategies. By extending the techniques introduced by Press and Dyson (23) and Akin (27), we also derive exact conditions when generalized versions of Grim, Tit-for-Tat, and Win-Stay Lose-Shift allow for stable cooperation. In this way, we find that most of the theoretical solutions for the repeated prisoner's dilemma can be directly transferred to repeated dilemmas with an arbitrary number of involved players.

In addition, we also propose two models to explore how individuals can further enhance their strategic options by coordinating their play with others. To this end, we extend the notion of ZD strategies for single players to subgroups of players (to which we refer as ZD alliances). We analyze two models of ZD alliances, depending on the degree of coordination between the players. When players form a strategy alliance, they only agree on the set of alliance members, and on a common strategy that each alliance member independently applies during the repeated game. When players form a synchronized alliance, on the other hand, they agree to act as a single entity, with all alliance members playing the same action in a given round. We show that the strategic power of ZD alliances depends on the size of the alliance, the applied strategy of the allies, and on the properties of the underlying social dilemma. Surprisingly, the degree of coordination only plays a role as alliances become large (in which case a synchronized alliance has more strategic options than a strategy alliance).

To obtain these results, we consider a repeated social dilemma between $n$ players. In each round of the game, players can decide whether to cooperate (C) or to defect (D). A player's payoff depends on the player's own decision and on the decisions of all other group members (Fig. 1A): in a group in which $j$ of the other group members cooperate, a cooperator receives the payoff $a_j$, whereas a defector obtains $b_j$. We assume that payoffs satisfy the
following three properties that are characteristic for social dilemmas (corresponding to the individual-centered interpretation of altruism in ref. 32): (i) irrespective of the own strategy, players prefer the other group members to cooperate ($a_{j+1} \ge a_j$ and $b_{j+1} \ge b_j$ for all $j$); (ii) within any mixed group, defectors obtain strictly higher payoffs than cooperators ($b_{j+1} > a_j$ for all $j$); and (iii) mutual cooperation is favored over mutual defection ($a_{n-1} > b_0$). To illustrate our results, we will discuss two particular examples of multiplayer games (Fig. 1B). In the first example, the public goods game (33), cooperators contribute an amount $c > 0$ to a common pool, knowing that total contributions are multiplied by $r$ (with $1 < r < n$) and evenly shared among all group members. Thus, a cooperator's payoff is $a_j = rc(j+1)/n - c$, whereas defectors yield $b_j = rcj/n$. In the second example, the volunteer's dilemma (34), at least one group member has to volunteer to bear a cost $c > 0$ in order for all group members to derive a benefit $b > c$. Therefore, cooperators obtain $a_j = b - c$ (irrespective of $j$), whereas defectors yield $b_j = b$ if $j \ge 1$ and $b_0 = 0$. Both examples (and many more, such as the collective risk dilemma) (7, 8, 35) are simple instances of multiplayer social dilemmas.

We assume that the social dilemma is repeated, such that individuals can react to their coplayers' past actions (for simplicity, we will focus here on the case of an infinitely repeated game). As usual, payoffs for the repeated game are defined as the average payoff that players obtain over all rounds. In general, strategies for such repeated games can become arbitrarily complex, as subjects may condition their behavior on past events and on the round number in nontrivial ways. Nevertheless, as in pairwise games, ZD strategies turn out to be surprisingly simple.

Fig. 1A (payoff table):
Number of cooperators among co-players: $n-1$, $n-2$, ..., 2, 1, 0
Cooperator's payoff: $a_{n-1}$, $a_{n-2}$, ..., $a_2$, $a_1$, $a_0$
Defector's payoff: $b_{n-1}$, $b_{n-2}$, ..., $b_2$, $b_1$, $b_0$

Results

Memory-One Strategies and Akin's Lemma. ZD strategies are memory-one strategies (23, 36); they only condition their behavior on the outcome of the previous round. Memory-one strategies can be written as a vector $p = (p_{C,n-1}, \ldots, p_{C,0};\ p_{D,n-1}, \ldots, p_{D,0})$. The entries $p_{S,j}$ denote the probability to cooperate in the next round, given that the player previously played $S \in \{C, D\}$ and that $j$ of the coplayers cooperated (in the SI Text, we present an extension in which players additionally take into account who of the coplayers cooperated). A simple example of a memory-one strategy is the strategy Repeat, $p^{\mathrm{Rep}}$, which simply reiterates the own move of the previous round, $p^{\mathrm{Rep}}_{C,j} = 1$ and $p^{\mathrm{Rep}}_{D,j} = 0$. In addition, memory-one strategies need to specify a cooperation probability $p_0$ for the first round. However, our results will often be independent of the initial play, and in that case we will drop $p_0$.

Let us consider a repeated game in which a focal player with memory-one strategy $p$ interacts with $n-1$ arbitrary coplayers (who are not restricted to any particular strategy). Let $v_{S,j}(t)$ denote the probability that the outcome of round $t$ is $(S, j)$. Let $v(t) = [v_{C,n-1}(t), \ldots, v_{D,0}(t)]$ be the vector of these probabilities. A limit distribution $v$ is a limit point for $t \to \infty$ of the sequence $[v(1) + \ldots + v(t)]/t$. The entries $v_{S,j}$ of such a limit distribution correspond to the fraction of rounds in which the focal player finds herself in state $(S, j)$ over the course of the game.

There is a surprisingly powerful relationship between a focal player's memory-one strategy and the resulting limit distribution of the iterated game. To show this relationship, let $q_C(t)$ be the probability that the focal player cooperates in round $t$. By definition of $p^{\mathrm{Rep}}$ we can write $q_C(t) = p^{\mathrm{Rep}} \cdot v(t) = v_{C,n-1}(t) + \ldots + v_{C,0}(t)$. Similarly, we can express the probability that the focal player cooperates in the next round as $q_C(t+1) = p \cdot v(t)$. It follows that $q_C(t+1) - q_C(t) = (p - p^{\mathrm{Rep}}) \cdot v(t)$. Summing up over all rounds from 1 to $t$, and dividing by $t$, yields $(p - p^{\mathrm{Rep}}) \cdot [v(1) + \ldots + v(t)]/t = [q_C(t+1) - q_C(1)]/t$, which has absolute value at most $1/t$. By taking the limit $t \to \infty$ we can conclude that

$$(p - p^{\mathrm{Rep}}) \cdot v = 0. \qquad [1]$$
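The payoff notation and the memory-one vectors introduced so far can be made concrete with a small sketch. All function names below (`public_goods_payoffs`, `volunteers_dilemma_payoffs`, `is_social_dilemma`, `repeat_strategy`) are illustrative assumptions rather than notation from the paper. Payoff lists are indexed by $j$, the number of cooperating coplayers, and a strategy is stored as a dictionary keyed by the state $(S, j)$.

```python
def public_goods_payoffs(n, r, c):
    """Return (a, b) for the public goods game.

    a[j] = r*c*(j+1)/n - c is a cooperator's payoff and b[j] = r*c*j/n a
    defector's payoff when j of the n-1 coplayers cooperate.
    """
    a = [r * c * (j + 1) / n - c for j in range(n)]
    b = [r * c * j / n for j in range(n)]
    return a, b


def volunteers_dilemma_payoffs(n, benefit, cost):
    """Return (a, b) for the volunteer's dilemma (benefit > cost > 0)."""
    a = [benefit - cost for _ in range(n)]
    b = [benefit if j >= 1 else 0.0 for j in range(n)]
    return a, b


def is_social_dilemma(a, b):
    """Check properties (i)-(iii) stated in the text for payoff lists a, b."""
    n = len(a)
    # (i) players prefer their coplayers to cooperate
    prop_i = all(a[j + 1] >= a[j] and b[j + 1] >= b[j] for j in range(n - 1))
    # (ii) in any mixed group, a defector earns strictly more than a cooperator
    prop_ii = all(b[j + 1] > a[j] for j in range(n - 1))
    # (iii) mutual cooperation beats mutual defection
    prop_iii = a[n - 1] > b[0]
    return prop_i and prop_ii and prop_iii


def repeat_strategy(n):
    """The strategy Repeat: cooperate after C, defect after D, for every j."""
    return {(S, j): (1.0 if S == "C" else 0.0) for S in "CD" for j in range(n)}


if __name__ == "__main__":
    a, b = public_goods_payoffs(n=4, r=2.0, c=1.0)
    print("public goods game:", a, b, is_social_dilemma(a, b))
    a, b = volunteers_dilemma_payoffs(n=5, benefit=2.0, cost=1.0)
    print("volunteer's dilemma:", a, b, is_social_dilemma(a, b))
```

Under these assumptions, a public goods game with $r < 1$ fails property (iii), because mutual cooperation then pays less than mutual defection.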
This relation between a player's memory-one strategy and the resulting limit distribution will prove to be extremely useful. Because the importance of Eq. 1 has been first highlighted by Akin (27) in the context of the pairwise prisoner's dilemma, we will refer to it as Akin's lemma. We note that Akin's lemma is remarkably general, because it neither makes any assumptions on the specific game being played nor does it make any restrictions on the strategies applied by the remaining $n-1$ group members.
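Akin's lemma can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' procedure. To make the distributions $v(t)$ exactly computable, the coplayers are also given random memory-one strategies, although the lemma itself does not require this. The names `random_memory_one`, `next_distribution` and `akin_check` are hypothetical. States are keyed as `(own, j)` with `own = 1` for C and `own = 0` for D, so that the Repeat strategy is simply $p^{\mathrm{Rep}}_{S,j} = S$.

```python
import itertools
import random


def random_memory_one(n, rng):
    """Random strategy: a cooperation probability for each state (own, j)."""
    return {(own, j): rng.random() for own in (0, 1) for j in range(n)}


def next_distribution(dist, strategies):
    """Advance the exact distribution over action profiles by one round."""
    n = len(strategies)
    new = {}
    for profile, prob in dist.items():
        total = sum(profile)
        # cooperation probability of each player given the current profile
        coop = [strategies[k][(profile[k], total - profile[k])] for k in range(n)]
        for nxt in itertools.product((0, 1), repeat=n):
            weight = prob
            for k in range(n):
                weight *= coop[k] if nxt[k] == 1 else 1.0 - coop[k]
            new[nxt] = new.get(nxt, 0.0) + weight
    return new


def akin_check(n=3, rounds=500, seed=1):
    """Return both sides of the finite-time identity behind Eq. 1.

    lhs = (p - p^Rep) . [v(1) + ... + v(T)] / T
    rhs = [q_C(T+1) - q_C(1)] / T, with T = rounds and player 0 as focal player.
    """
    rng = random.Random(seed)
    strategies = [random_memory_one(n, rng) for _ in range(n)]
    p = strategies[0]
    first_round = [rng.random() for _ in range(n)]  # initial cooperation probabilities
    dist = {}
    for profile in itertools.product((0, 1), repeat=n):
        prob = 1.0
        for k in range(n):
            prob *= first_round[k] if profile[k] == 1 else 1.0 - first_round[k]
        dist[profile] = prob
    q_first = sum(pr for prof, pr in dist.items() if prof[0] == 1)
    v_sum = {state: 0.0 for state in p}
    for _ in range(rounds):
        for profile, prob in dist.items():
            v_sum[(profile[0], sum(profile[1:]))] += prob
        dist = next_distribution(dist, strategies)
    q_last = sum(pr for prof, pr in dist.items() if prof[0] == 1)
    lhs = sum((p[state] - state[0]) * v_sum[state] / rounds for state in p)
    rhs = (q_last - q_first) / rounds
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = akin_check()
    print(lhs, rhs)
```

Both returned values agree up to rounding error, and their magnitude is bounded by one over the number of rounds, which is the step that yields Eq. 1 in the limit.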
Fig. 1. Illustration of the model assumptions for repeated social dilemmas. (A) We consider symmetric n-player social dilemmas in which each player can either cooperate or defect. The player's payoff depends on its own decision and on the number of other group members who decide to cooperate. (B) We will discuss two particular examples: the public goods game (in which payoffs are proportional to the number of cooperators) and the volunteer's dilemma (as the most simple example of a nonlinear social dilemma). (C) In addition to individual strategies, we will also explore how subjects can enhance their strategic options by coordinating their play with other group members. We refer to the members of such a ZD alliance as allies, and we call group members that are not part of the ZD alliance outsiders. Outsiders are not restricted to any particular strategy. Some or all of the outsiders may even form their own alliance.

Zero-Determinant Strategies in Multiplayer Social Dilemmas. As an application of Akin's lemma, we will show in the following that single players can gain an unexpected amount of control over the resulting payoffs in a multiplayer social dilemma. To this end, we first need to introduce some further notation. For a focal player $i$, let us write the possible payoffs in a given round as a vector $g^{i} = (g^{i}_{S,j})$, with $g^{i}_{C,j} = a_j$ and $g^{i}_{D,j} = b_j$. Similarly, let us write the average payoffs of $i$'s coplayers as $g^{-i} = (g^{-i}_{S,j})$, where the entries are given by $g^{-i}_{C,j} = [j a_j + (n-j-1) b_{j+1}]/(n-1)$ and $g^{-i}_{D,j} = [j a_{j-1} + (n-j-1) b_j]/(n-1)$. Finally, let $\mathbf{1}$ denote the $2n$-dimensional vector with all entries being one. Using this notation, we can write player $i$'s payoff in the repeated game as $\pi^{i} = g^{i} \cdot v$, and the average payoff of $i$'s coplayers as $\pi^{-i} = g^{-i} \cdot v$. Moreover, by definition of $v$ as a limit distribution, it follows that $\mathbf{1} \cdot v = 1$. After these preparations, let us assume player $i$ applies the memory-one strategy

$$p = p^{\mathrm{Rep}} + \alpha g^{i} + \beta g^{-i} + \gamma \mathbf{1}, \qquad [2]$$

with $\alpha$, $\beta$, and $\gamma$ being parameters that can be chosen by player $i$ (with the only restriction that $\beta \neq 0$). Due to Akin's lemma, we can conclude that such a player enforces the relationship

$$0 = (p - p^{\mathrm{Rep}}) \cdot v = (\alpha g^{i} + \beta g^{-i} + \gamma \mathbf{1}) \cdot v = \alpha \pi^{i} + \beta \pi^{-i} + \gamma. \qquad [3]$$

Player $i$'s strategy thus guarantees that the resulting payoffs of the repeated game obey a linear relationship, irrespective of how the other group members play. Moreover, by appropriately choosing the parameters $\alpha$, $\beta$, and $\gamma$, the player has direct control on the form of this payoff relation. As in Press and Dyson (23), who were first to discover such strategies for the prisoner's dilemma, we refer to the memory-one strategies in Eq. 2 as zero-determinant strategies or ZD strategies.

For our purpose, it will be convenient to proceed with a slightly different representation of ZD strategies. Using the parameter transformation $l = -\gamma/(\alpha + \beta)$, $s = -\alpha/\beta$, and $\phi = -\beta$, ZD strategies take the form

$$p = p^{\mathrm{Rep}} + \phi\,[(1 - s)(l\,\mathbf{1} - g^{i}) + g^{i} - g^{-i}], \qquad [4]$$

and the enforced payoff relationship according to Eq. 3 becomes

$$\pi^{-i} = s\,\pi^{i} + (1 - s)\,l. \qquad [5]$$

We refer to $l$ as the baseline payoff of the ZD strategy and to $s$ as the strategy's slope. Both parameters allow an intuitive interpretation: when all players adopt the same ZD strategy $p$ such that $\pi^{i} = \pi^{-i}$, it follows from Eq. 5 that each player yields the payoff $l$. The value of $s$ determines how the mean payoff of the other group members $\pi^{-i}$ varies with $\pi^{i}$. The parameter $\phi$ does not have a direct effect on Eq. 5; however, the magnitude of $\phi$ determines how fast payoffs converge to this linear payoff relationship as the repeated game proceeds (37).

The parameters $l$, $s$, and $\phi$ of a ZD strategy cannot be chosen arbitrarily, because the entries $p_{S,j}$ are probabilities that need to satisfy $0 \le p_{S,j} \le 1$. In general, the admissible parameters depend on the specific social dilemma being played. In SI Text, we show that exactly those relations 5 can be enforced for which either $s = 1$ (in which case the parameter $l$ in the definition of ZD strategies becomes irrelevant) or for which $l$ and $s < 1$ satisfy

$$\max_{0 \le j \le n-1} \left\{ b_j - \frac{j}{n-1}\,\frac{b_j - a_{j-1}}{1 - s} \right\} \le l \le \min_{0 \le j \le n-1} \left\{ a_j + \frac{n-j-1}{n-1}\,\frac{b_{j+1} - a_j}{1 - s} \right\}. \qquad [6]$$

It follows that feasible baseline payoffs are bounded by the payoffs for mutual cooperation and mutual defection, $b_0 \le l \le a_{n-1}$, and that the slope needs to satisfy $-1/(n-1) \le s \le 1$. With $s$ sufficiently close to 1, any baseline payoff between $b_0$ and $a_{n-1}$ can be achieved. Moreover, because the conditions in Eq. 6 become increasingly restrictive as the group size $n$ increases, larger groups make it more difficult for players to enforce specific payoff relationships.

Important Examples of ZD Strategies. In the following, we discuss some examples of ZD strategies. At first, let us consider a player who sets the slope to $s = 1$. By Eq. 5, such a player enforces the payoff relation $\pi^{-i} = \pi^{i}$, such that $i$'s payoff matches the average payoff of the other group members. We call such ZD strategies fair. As shown in Fig. 2A, fair strategies do not ensure that all group members get the same payoff; due to our definition of social dilemmas, unconditional defectors always outperform unconditional cooperators, no matter whether the group also contains fair players. Instead, fair players can only ensure that they do not take any unilateral advantage of their peers. Our characterization 6 implies that all social dilemmas permit a player to be fair, irrespective of the group size. As an example, consider the strategy proportional Tit-for-Tat (pTFT), for which the probability to cooperate is simply given by the fraction of cooperators among the coplayers in the previous round,

$$\mathrm{pTFT}_{S,j} = \frac{j}{n-1}. \qquad [7]$$

For pairwise games, this definition of pTFT simplifies to Tit-for-Tat, which is a fair ZD strategy (23). However, also for the public goods game and for the volunteer's dilemma, pTFT is a ZD strategy, because it can be obtained from Eq. 4 by setting $s = 1$ and $\phi = 1/c$, with $c$ being the cost of cooperation.

As another interesting subclass of ZD strategies, let us consider strategies that choose the mutual defection payoff as baseline payoff, $l = b_0$, and that enforce a positive slope $0 < s < 1$. The enforced payoff relation 5 becomes $\pi^{-i} = s\,\pi^{i} + (1 - s)\,b_0$, implying that on average the other group members only get a fraction $s$ of any surplus over the mutual defection payoff. Moreover, as the slope $s$ is positive, the payoffs $\pi^{i}$ and $\pi^{-i}$ are positively related. As a consequence, the collective best reply for the remaining group members is to maximize $i$'s payoffs by cooperating in every round. In analogy to Press and Dyson (23), we call such ZD strategies extortionate, and we call the quantity $\chi = 1/s$ the extortion factor. For games in which $l = b_0 = 0$, Eq. 5 shows that the extortion factor can be written as $\chi = \pi^{i}/\pi^{-i}$. Large extortion factors thus signal a substantial inequality in favor of player $i$. Extortionate strategies are particularly powerful in social dilemmas in which mutual defection leads to the lowest group payoff (as in the public goods game and in the volunteer's dilemma). In that case, they enforce the relation $\pi^{i} \ge \pi^{-i}$; on average, player $i$ performs at least as well as the other group members (as also depicted in Fig. 2B). As an example, let us consider a public goods game and a ZD strategy $p^{\mathrm{Ex}}$ with $l = 0$, $\phi = n/[(n - r)sc + rc]$, for which Eq. 4 implies

$$p^{\mathrm{Ex}}_{S,j} = \frac{j}{n-1} \left[ 1 - (1 - s)\,\frac{n(r - 1)}{r + (n - r)s} \right], \qquad [8]$$

independent of the player's own move $S \in \{C, D\}$. In the limit $s \to 1$, $p^{\mathrm{Ex}}$ approaches the fair strategy pTFT. As $s$ decreases from 1, the cooperation probabilities of $p^{\mathrm{Ex}}$ are increasingly biased to the own advantage. Extortionate strategies exist for all social dilemmas (this follows from condition [6] by setting $l = b_0$ and choosing an $s$ close to 1). However, larger groups make extortion more difficult. For example, in public goods games with $n > r/(r - 1)$, players cannot be arbitrarily extortionate any longer as [6] implies that there is an upper bound on $\chi$ (SI Text).

As the benevolent counterpart to extortioners, Stewart and Plotkin described a set of generous strategies for the iterated prisoner's dilemma (24, 28). Generous players set the baseline payoff to the mutual cooperation payoff $l = a_{n-1}$ while still enforcing a positive slope $0 < s < 1$. This results in the payoff relation $\pi^{-i} = s\,\pi^{i} + (1 - s)\,a_{n-1}$. In particular, for games in which mutual cooperation is the optimal outcome for the group (as in the public goods game and in the prisoner's dilemma but not in the volunteer's dilemma), the payoff of a generous player satisfies $\pi^{i} \le \pi^{-i}$ (Fig. 2C). For the example of a public goods game, we obtain a generous ZD strategy $p^{\mathrm{Ge}}$ by setting $l = rc - c$ and $\phi = n/[(n - r)sc + rc]$, such that

$$p^{\mathrm{Ge}}_{S,j} = \frac{j}{n-1} + (1 - s)\,\frac{n-j-1}{n-1}\,\frac{n(r - 1)}{r + (n - r)s}. \qquad [9]$$

For $s \to 1$, $p^{\mathrm{Ge}}$ approaches the fair strategy pTFT, whereas lower values of $s$ make $p^{\mathrm{Ge}}$ more cooperative. Again, such generous strategies exist for all social dilemmas, but the extent to which players can be generous depends on the particular social dilemma and on the size of the group.

As a last interesting class of ZD strategies, let us consider players who choose $s = 0$. By Eq. 5, such players enforce the payoff relation $\pi^{-i} = l$, meaning that they have unilateral control over the mean payoff of the other group members (for the prisoner's dilemma, such equalizer strategies were first discovered in ref. 38).
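The ZD strategies discussed above can be cross-checked against each other numerically. The following sketch builds a strategy directly from Eq. 4 and evaluates the bounds on $l$ from condition [6], so that the results can be compared with the closed forms in Eqs. 7, 8 and 9 for the public goods game. Function names such as `zd_strategy` and `baseline_bounds` are illustrative assumptions, and the parameter values in the example ($n = 20$, $r = 4$, $c = 1$, $s = 0.8$) are borrowed from the Fig. 2 caption.

```python
def public_goods_payoffs(n, r, c):
    """a[j], b[j]: cooperator / defector payoff with j cooperating coplayers."""
    a = [r * c * (j + 1) / n - c for j in range(n)]
    b = [r * c * j / n for j in range(n)]
    return a, b


def zd_strategy(a, b, l, s, phi):
    """Eq. 4: p = p^Rep + phi * [(1 - s)(l*1 - g^i) + g^i - g^{-i}]."""
    n = len(a)
    p = {}
    for j in range(n):
        defectors = n - j - 1
        # Average coplayer payoffs; terms with weight zero (b_n, a_{-1}) are left out.
        g_minus_c = (j * a[j] + (defectors * b[j + 1] if defectors else 0.0)) / (n - 1)
        g_minus_d = ((j * a[j - 1] if j else 0.0) + defectors * b[j]) / (n - 1)
        p[("C", j)] = 1.0 + phi * ((1 - s) * (l - a[j]) + a[j] - g_minus_c)
        p[("D", j)] = phi * ((1 - s) * (l - b[j]) + b[j] - g_minus_d)
    return p


def baseline_bounds(a, b, s):
    """Lower and upper bound on the baseline payoff l from condition [6] (s < 1)."""
    n = len(a)
    lower = max(b[j] - (j * (b[j] - a[j - 1]) / ((n - 1) * (1 - s)) if j else 0.0)
                for j in range(n))
    upper = min(a[j] + ((n - j - 1) * (b[j + 1] - a[j]) / ((n - 1) * (1 - s))
                        if j < n - 1 else 0.0)
                for j in range(n))
    return lower, upper


def p_tft(n):
    """Eq. 7: proportional Tit-for-Tat."""
    return [j / (n - 1) for j in range(n)]


def p_extortionate(n, r, s):
    """Eq. 8: extortionate strategy for the public goods game."""
    k = n * (r - 1) / (r + (n - r) * s)
    return [j / (n - 1) * (1 - (1 - s) * k) for j in range(n)]


def p_generous(n, r, s):
    """Eq. 9: generous strategy for the public goods game."""
    k = n * (r - 1) / (r + (n - r) * s)
    return [j / (n - 1) + (1 - s) * (n - j - 1) / (n - 1) * k for j in range(n)]


if __name__ == "__main__":
    n, r, c, s = 20, 4.0, 1.0, 0.8
    a, b = public_goods_payoffs(n, r, c)
    phi = n / ((n - r) * s * c + r * c)
    ex = zd_strategy(a, b, 0.0, s, phi)
    print(max(abs(ex[("C", j)] - p_extortionate(n, r, s)[j]) for j in range(n)))
    print(baseline_bounds(a, b, s))
```

With these parameters, the feasible range from condition [6] spans the mutual defection and mutual cooperation payoffs. Lowering the slope to $s = 0.7$ pushes the lower bound above $b_0 = 0$, so the extortionate baseline is no longer admissible, in line with the remark that large groups limit extortion.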
Fig. 2. Characteristic dynamics of payoffs over the course of the game for three different ZD strategies. Each panel depicts the payoff of the focal player $\pi^{i}$ (blue) and the average payoff of the other group members $\pi^{-i}$ (red) by thick lines. Additionally, the individual payoffs of the other group members are shown as thin red lines. (A) A fair player ensures that the own payoff matches the mean payoff of the other group members. This does not imply that all other group members yield the same payoff. (B) For games in which mutual defection leads to the lowest group payoff, extortionate players ensure that their payoffs are above average. (C) In games in which mutual cooperation is the social optimum, generous players let their coplayers gain higher payoffs. The three graphs depict the case of a public goods game with $r = 4$, $c = 1$, and group size $n = 20$. For the strategies of the other group members, we used random memory-one strategies, where the cooperation probabilities were independently drawn from a uniform distribution. For the strategies of the focal player, we used (A) pTFT, (B) $p^{\mathrm{Ex}}$ with $s = 0.8$, and (C) $p^{\mathrm{Ge}}$ with $s = 0.8$.
[Panels: (A) Fair strategy, (B) Extortionate strategy, (C) Generous strategy. Vertical axis: payoff. Horizontal axis: round number, 0 to 100.]
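A setup like the one described in the Fig. 2 caption can be approximated with a short Monte Carlo simulation. The sketch below is not the authors' code. It assumes a public goods game with the caption's parameters, an extortionate focal player following Eq. 8, and coplayers with uniformly random memory-one strategies. The function names, the random first round, and the longer run of 20,000 rounds (chosen here to reduce sampling noise) are all assumptions of this example.

```python
import random


def extortionate_pgg(n, r, s):
    """Eq. 8: cooperation probability after j coplayers cooperated."""
    factor = 1.0 - (1.0 - s) * n * (r - 1.0) / (r + (n - r) * s)
    return [factor * j / (n - 1) for j in range(n)]


def simulate(n=20, r=4.0, c=1.0, s=0.8, rounds=20000, seed=0):
    """Return (focal payoff, mean coplayer payoff), averaged over all rounds."""
    rng = random.Random(seed)
    focal = extortionate_pgg(n, r, s)  # same probabilities after C and after D
    # Coplayer k uses others[k][own][j], with own = 1 after C and 0 after D.
    others = [[[rng.random() for _ in range(n)] for _ in (0, 1)]
              for _ in range(n - 1)]
    # Arbitrary first round; index 0 is the focal player.
    actions = [rng.random() < 0.5 for _ in range(n)]
    focal_total = 0.0
    others_total = 0.0
    for _ in range(rounds):
        k_coop = sum(actions)
        share = r * c * k_coop / n  # every player's share of the common pool
        focal_total += share - (c if actions[0] else 0.0)
        # mean payoff of the n-1 coplayers in this round
        others_total += share - c * (k_coop - actions[0]) / (n - 1)
        nxt = [rng.random() < focal[k_coop - actions[0]]]
        for k in range(1, n):
            j = k_coop - actions[k]  # cooperators among player k's coplayers
            nxt.append(rng.random() < others[k - 1][int(actions[k])][j])
        actions = nxt
    return focal_total / rounds, others_total / rounds


if __name__ == "__main__":
    pi_focal, pi_others = simulate()
    print(pi_focal, pi_others, 0.8 * pi_focal)
```

Because the extortionate strategy sets $l = 0$, Eq. 5 predicts that the coplayers' mean payoff is close to $s$ times the focal player's payoff, up to sampling noise that shrinks as the number of rounds grows.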

However, unlike extortionate and generous strategies, equalizer strategies typically cease to exist once the group size exceeds a critical threshold. For the example of a public goods game this threshold is given by $n = 2r/(r - 1)$. For larger groups, single players cannot determine the mean payoff of their peers any longer.

Stable Cooperation in Multiplayer Social Dilemmas. Let us next explore which ZD strategies give rise to a Nash equilibrium with stable cooperation. In SI Text, we prove that such ZD strategies need to have two properties: they need to be generous (by setting $l = a_{n-1}$ and $s > 0$), but they must not be too generous [the slope needs to satisfy $s \ge (n - 2)/(n - 1)$]. In particular, whereas in the repeated prisoner's dilemma any generous strategy with $s > 0$ is a Nash equilibrium (27, 28), larger group sizes make it increasingly difficult to uphold cooperation. In the limit of infinitely large groups, it follows that $s$ needs to approach 1, suggesting that ZD [...]

[...] cooperation and after mutual defection [i.e., $p_{C,n-1} = p_{D,0} = 1$ and $p_{S,j} = 0$ for all other states $(S, j)$]. We refer to such a behavior as WSLS, because for pairwise dilemmas it corresponds to the Win-Stay, Lose-Shift strategy described by ref. 36. Because of condition [10], WSLS is a Nash equilibrium if and only if the social dilemma satisfies $(b_{n-1} + b_0)/2 \le a_{n-1}$. For the example of a public goods game, this condition simplifies to $r \ge 2n/(n + 1)$, which is always fulfilled for $r \ge 2$. For social dilemmas that meet this condition, WSLS provides a stable route to cooperation that is robust to errors.

Zero-Determinant Alliances. In agreement with most of the theoretical literature on repeated social dilemmas, our previous analysis is based on the assumption that individuals act independently. As a result, we observed that a player's strategic options typically diminish with group size. As a countermeasure, subjects may try to gain st