Introduction

Multi-strategy evolutionary dynamics in nature often lead to diverse and complex phenomena, such as cyclic dominance that is captured by the well-known rock-paper-scissors game1. Experimental evidence from diverse contexts, ranging from the three-morph mating system of the side-blotched lizard2 and Escherichia coli populations3, to human economic behaviors4, demonstrates the occurrence of the rock-paper-scissors cycle in various real-world scenarios. Theoretical models of the rock-paper-scissors cycle have been explored in both two-player5 and multiplayer game frameworks6, contributing to an understanding of its underlying properties—as a consequence of strategy diversity, the intransitive interaction may emerge spontaneously. This phenomenon can be illustrated when we extend the basic two-strategy model of the evolution of cooperation by adding additional strategies that punish defectors7,8 or reward cooperators9,10. The additional strategies are necessary when considering more realistic models, which underlines the importance of a multi-strategy approach.

Previous research in evolutionary dynamics primarily focused on two-strategy systems, where the unconditional cooperator and defector strategies represent the fundamental conflict of individual and collective interests11. While cooperation can maximize mutual benefits, defection, despite offering higher personal payoff, reduces overall benefits to others. Consequently, defection often appears as the dominant strategy. An escape route from this dilemma could be a spatially structured population12,13, where individuals interact with fixed neighbors but still adopt the strategies of those with higher payoffs. This setting allows cooperation to form clusters, utilizing the advantage of collective payoffs thus resisting the invasion of defection, a concept known as spatial reciprocity14. It is recognized that no simple closed-form solution exists for general evolutionary dynamics in structured populations, unless by chance P = NP, Polynomial time equals to Nondeterministic Polynomial time15. However, in the weak selection limit, where the influence of the game on strategy updates is marginal, analytical solutions have been obtained from infinite16 to finite populations17, and from regular18,19 to arbitrary graphs20,21,22,23. This line of research has led to the development of evolutionary graph theory14.

In evolutionary graph theory, a widely used mathematical technique is the pair approximation24,25,26,27,28. This method applies to infinite populations on regular graphs and has revealed the well-known ‘b/c > k’ rule, which states that evolution favors cooperation when the benefit-to-cost ratio exceeds the number of neighbors29. Pair approximation is also capable of analyzing more complex models, including unequal interaction and dispersal graphs30, asymmetric networks31, and stochastic games32, predicting simulation outcomes with high accuracy. Notably, pair approximation has been applied to multi-strategy two-player games33, leading to the replicator equations for arbitrary n-strategy two-player games on a regular graph, as an important extension of the traditional replicator equations used in well-mixed populations34.

Unlike two-player games, multiplayer games exhibit much greater complexity, primarily due to their potentially nonlinear payoff functions35. In a structured population, multiplayer games require each individual to organize a game within their neighbors and themselves, which implies that individuals participate in games organized by both themselves and their neighbors, thereby interacting with second-order neighbors. Such interactions lead to higher-order interactions36,37, which cannot be simply reduced to a superposition of pairwise interactions. The complexity of multiplayer games can also be illustrated from the perspective of structure coefficients on graphs: a two-strategy two-player game needs only one structure coefficient38, a multi-strategy two-player game requires three39,40, but a two-strategy (k + 1)-player game needs as many as k structure coefficients41,42. The number of potential equilibrium points in general multiplayer games also indicates their complexity43,44. Even so, in the absence of triangle motifs, two-strategy multiplayer games can still be theoretically analyzed using pair approximation45,46, whose results are consistent with predictions obtained by other more precise methods47,48.

With two-strategy two-player29, multi-strategy two-player33, and two-strategy multiplayer games46 all thoroughly studied, the analytical solution for multi-strategy multiplayer games on graphs remains unexplored. The range of potential models for multiplayer games with more than two strategies is vast, drawing from co-evolutionary strategies such as punishment8,49,50,51, reward9,52,53, and the loner strategy4,54. Multi-strategy systems in multiplayer games have unique characteristics that multistrategy two-player games do not capture: the payoff function can be nonlinear. For example, in pool punishment, the payoff structure depends solely on whether there is at least one punishing player among the k + 1 players. This uniqueness reinforces the significance of studying multistrategy multiplayer games.

However, previous research on these games on graphs has largely been limited to numerical simulations, which do not allow for the exploration of the complete parameter space. In the absence of mathematical tools for evolutionary graph theory in multi-strategy multiplayer games, recent studies have attempted to bypass this challenge by incorporating the third strategy within the existing two strategies. For instance, punishing or rewarding behaviors have been added to the existing cooperation strategy in the traditional two-strategy system55,56. This approach allows for the examination of additional mechanisms like punishment and reward within the two-strategy system’s framework. Yet, these alternative attempts still could not capture further rich dynamics, such as cyclic dominance, which is only possible in systems with at least three strategies.

In this work, we provide an analytical framework that addresses the gaps in multi-strategy multiplayer games in the realm of evolutionary graph theory. Inspired by the Balls-and-Boxes problem, we demonstrate that for a given multi-strategy multiplayer game, counting the co-player configurations of a focal individual is equivalent to distributing k identical co-players into n distinct strategies (Fig. 1). On this basis, we develop a bottom-up approach for calculating the group-based payoff of individuals on regular graphs, deriving replicator equations on regular graphs in the weak selection limit and the absence of triangle motifs. Our results include two commonly used update rules, namely pairwise comparison (PC)57 and death-birth (DB)29, applicable to arbitrary multiplayer games with any multi-strategy space in structured populations, where each individual has the same number of neighbors. Using the punishment mechanism7,58,59 in the context of the tragedy of the commons60 as an example, we explore the well-known second-order free-riding problem analytically, obtaining an accurate threshold of punishment strength necessary to resolve the social dilemma in structured populations. Additionally, our theoretical solutions can qualitatively reproduce the phase diagrams observed in previous numerical simulation studies under non-marginal selection strength.

Fig. 1: Generalized payoff matrix for multi-strategy multiplayer games, reducible to two-strategy or two-player formats.
figure 1

a Payoff matrix for n-strategy (k + 1)-player games: with n strategies and k co-players, the matrix size is \(n\times {\mathbb{C}}(k+n-1,\, k)\). b Payoff matrix for 2-strategy (k + 1)-player games46: with n = 2 strategies and k co-players, the matrix size is \(2\times {\mathbb{C}}(k+1,\, k)\). c, Payoff matrix for n-strategy 2-player games33: with n strategies and k = 1 co-players, the matrix size is \(n\times {\mathbb{C}}(n,\, 1)\). d, Payoff matrix for 2-strategy 2-player games29: with n = 2 strategies and k = 1 co-players, the matrix size is \(2\times {\mathbb{C}}(2,\, 1)\).

Results

Model overview

We consider an infinite population on a regular graph, where each individual has k neighbors. An individual can adopt one of n strategies, labeled by the numbers 1, 2, …, n. On a regular graph, the number of co-players in every multiplayer game is equivalent to the constant number k of neighbors. For a given individual, suppose that there are k1 co-players employing strategy 1, k2 co-players employing strategy 2, and so on, up to kn co-players employing strategy n. In this context, the co-player configuration of an individual can be represented by k = (k1, k2, …, kn), which satisfies the condition \(\sum_{l=1}^{n}{k}_{l}=k\). As illustrated in Fig. 2a, counting the number of possible configurations of k is analogous to the classic Balls-and-Boxes problem, distributing k identical balls (i.e., co-players) into n distinct boxes (i.e., strategies), allowing for the possibility of empty boxes (e.g., k1 = 0). Hence, there are \({\mathbb{C}}(k+n-1,\, k)=(k+n-1)!/[(n-1)!k!]\) possible configurations of co-player strategy configurations k.

Fig. 2: The Balls-and-Boxes problem, payoff calculation, and strategy updates.
figure 2

a To determine a co-player configuration, we distribute k identical co-players (balls) into n distinct strategies (boxes), corresponding to the classic Balls-and-Boxes problem. For instance, with k = 4 and n = 3, there are 15 possible co-player configurations. b An individual accumulates payoffs from 1 + k multiplayer games, organized by itself and its k neighbors. Each multiplayer game involves k + 1 players, including the organizer and its neighbors. c Strategy updates in an individual are governed by the pairwise comparison rule. A random neighbor is selected, and the focal individual either adopts the neighbor’s strategy or maintains its original strategy, based on a probability proportional to the fitness (transformed from payoffs) in the pair.

Interaction occurs between an individual and its k co-players. In a multiplayer game involving the focal individual and k co-players, the payoff is uniquely determined by the strategy of the focal individual and the strategy configuration of the co-players. For a focal individual employing strategy i with the co-player configuration k, its payoff is denoted by aik. It can be observed that the ‘generalized payoff matrix’ comprises \(n\times {\mathbb{C}}(k+n-1,\, k)\) elements represented by aik through all possible focal strategies i = 1, 2, …, n and co-player strategy configurations k. For two-strategy two-player games (n = 2, k = 1), the number of elements in the payoff matrix reduces to \(2\times {\mathbb{C}}(2,1)=4\); for multi-strategy two-player games (k = 1), it reduces to \(n\times {\mathbb{C}}(n,1)={n}^{2}\); for two-strategy multiplayer games (n = 2), it reduces to \(2\times {\mathbb{C}}(k+1,\, k)=2(k+1)\) (Fig. 1).

The accumulated payoff of a focal individual is collected from the 1 + k games organized by itself and its neighbors, as depicted in Fig. 2b. Upon obtaining the accumulated payoffs π, we convert them into fitness, denoted as \(F=\exp (\delta \pi )\)21,48,61,62. Strategies that yield higher fitness are more likely to reproduce. Here, δ → 0+ represents a weak selection limit. The rationale behind weak selection is that, in reality, many factors other than the investigated game influence the probability of reproduction29.

There are various commonly used strategy update rules. For simplicity, we focus on the pairwise comparison (PC) rule57 in the main text (another well-known rule, the death-birth, is discussed in Supplementary Information). During each elementary step, an individual A and one of its neighbors B are randomly selected from the population. Their payoffs are computed as πA and πB and then transformed into fitness values FA and FB. Individual A adopts the strategy of individual B with a probability proportional to their fitness in the pair,

$$W=\frac{{F}_{B}}{{F}_{A}+{F}_{B}}=\frac{1}{1+\exp [-\delta ({\pi }_{B}-{\pi }_{A})]}.$$
(1)

Or, individual A keeps its own strategy with the remaining probability FA/(FA + FB). Eq. (1) indicates that individual A has a marginal tendency to either maintain its own strategy or adopt the one of individual B, depending on who has higher fitness. The evolution of strategies under the PC update process is illustrated in Fig. 2c.

Group-based payoff with any number of strategies

To formally analyze the evolutionary dynamics, we construct the system as described in Supplementary Note 1.1. According to pair approximation29, there are two key concepts, the frequency of i-players (i.e., individuals employing strategy i), denoted as xi, where \({\sum }_{i=1}^{n}{x}_{i}=1\), and the probability of an i-player being adjacent to a j-player, denoted by qij, with \({\sum }_{j=1}^{n}{q}_{j| i}=1\). By separating different time scales, we find that qij = xi(k − 2)/(k − 1) + θij/(k − 1), where θij = 1 if i = j and θij = 0 otherwise (Supplementary Note 5.1). In other words, the value of qij can be determined by the value of xi.

To express necessary computations, we introduce a variation of k, denoted as k+l = (k1, k2, …, kl + 1, …, kn), where \({\sum }_{l=1}^{n}{k}_{l}=k-1\). This represents a co-player configuration in which there is at least one l-player. Among the remaining k − 1 co-players, the numbers of players adopting strategies 1, 2, …, n are k1, k2, …, kn, respectively.

We label the payoff that an individual obtains in a multiplayer game as the single-game payoff. \({\langle {a}_{X| {{{{{{{\bf{k}}}}}}}}}\rangle }_{Y}\) is used to denote the expected single-game payoff for an X-player over the possible co-player configurations k, where the k members in k are neighbors of a Y-player, as defined by Eq. (12) in the Methods. Similarly, the notation \({\langle {a}_{X| {{{{{{{{\bf{k}}}}}}}}}_{+l}}\rangle }_{Y}\) differs in that it is over k − 1 unknown members in the possible co-player configurations k+l, with one known l-player.

Furthermore, we use the notation \(\langle {\pi }_{X}^{{{{{{{{\bf{k}}}}}}}}}\rangle\) to represent the expected accumulated payoff of an X-player obtained in the 1 + k games organized by the player and its neighbors, across all possible neighbor configurations k of the X-player, defined by Eq. (10) in the Methods. Similarly, \(\langle {\pi }_{X}^{{{{{{{{{\bf{k}}}}}}}}}_{+i}}\rangle\) denotes the expected accumulated payoff over the configurations where the remaining k − 1 neighbors are unknown besides a known i-player, as defined by Eq. (11) in the Methods.

Through bottom-up calculations from the microscopic level (Methods), we establish the following relationship between the expected accumulated and single-game payoffs. For i-players, the relation is given by

$$\left\langle {\pi }_{i}^{{{{{{{{\bf{k}}}}}}}}}\right\rangle={\langle {a}_{i| {{{{{{{\bf{k}}}}}}}}}\rangle }_{i}+k{\sum }_{l=1}^{n}{q}_{l| i}{\langle {a}_{i| {{{{{{{{\bf{k}}}}}}}}}_{+l}^{{\prime} }}\rangle }_{l}.$$
(2)

Intuitively, the expected accumulated payoff of i-players, \(\langle {\pi }_{i}^{{{{{{{{\bf{k}}}}}}}}}\rangle\), is composed by the expected single-game payoff from the game they organize, \({\langle {a}_{i| {{{{{{{\bf{k}}}}}}}}}\rangle }_{i}\), and the games organized by their k neighbors, \(k\sum_{l=1}^{n}{q}_{l| i}{\langle {a}_{i| {{{{{{{{\bf{k}}}}}}}}}_{+l}^{{\prime} }}\rangle }_{l}\). Here, the different notation \({{{{{{{{\bf{k}}}}}}}}}^{{\prime} }=({k}_{1}^{{\prime} },\, {k}_{2}^{{\prime} },\ldots,\, {k}_{n}^{{\prime} })\) from k is an independent configuration to clarify the priority in the summation.

A further concept is the expected accumulated payoff of a j-player who has at least one i-player as a neighbor, which is related to the expected single-game payoff as follows:

$$\left\langle {\pi }_{j}^{{{{{{{{{\bf{k}}}}}}}}}_{+i}}\right\rangle={\langle {a}_{j| {{{{{{{{\bf{k}}}}}}}}}_{+i}}\rangle }_{j}+{\langle {a}_{j| {{{{{{{{\bf{k}}}}}}}}}_{+i}^{{\prime} }}\rangle }_{i}+(k-1){\sum }_{l=1}^{n}{q}_{l| \, j}{\langle {a}_{j| {{{{{{{{\bf{k}}}}}}}}}_{+l}^{{\prime} }}\rangle }_{l}.$$
(3)

Here, \({\langle {a}_{j| {{{{{{{{\bf{k}}}}}}}}}_{+i}}\rangle }_{j}\), \({\langle {a}_{j| {{{{{{{{\bf{k}}}}}}}}}_{+i}^{{\prime} }}\rangle }_{i}\), and \((k-1)\sum_{l=1}^{n}{q}_{l| \, j}{\langle {a}_{j| {{{{{{{{\bf{k}}}}}}}}}_{+l}^{{\prime} }}\rangle }_{l}\) are the expected single-game payoff from the game organized by the j-player itself, the game organized by the fixed i-player neighbor, and the games organized by the remaining k − 1 neighbors of the j-player.

General replicator equations

The evolution of frequencies x1, x2, …, xn can be deduced through the microscopic strategy update process. Specifically, in an infinite population, i.e., N → , a single unit of time comprises N elementary steps, ensuring that each individual has an opportunity to update their strategy. During each elementary step, the frequency of i-players increases by 1/N when a focal j-player (where j ≠ i) is chosen to update its strategy and is replaced by an i-player. Similarly, the frequency of i-players decreases by 1/N when a focal i-player is selected to update its strategy and the player who takes the position is not an i-player. Based on this perception, we derive a simple form of the replicator equations for i = 1, 2, …, n in the weak selection limit (Supplementary Note 2):

$${\dot{x}}_{i}=\frac{\delta }{2}{x}_{i}\left(\left\langle {\pi }_{i}^{{{{{{{{\bf{k}}}}}}}}}\right\rangle -\mathop{\sum }_{j=1}^{n}{q}_{j| i}\left\langle {\pi }_{j}^{{{{{{{{{\bf{k}}}}}}}}}_{+i}}\right\rangle \right).$$
(4)

We find that Eq. (4) offers an intuitive understanding, if we introduce the following two concepts: (1) \({\pi }_{i}^{(0)}=\langle {\pi }_{i}^{{{{{{{{\bf{k}}}}}}}}}\rangle\), the expected accumulated payoff of the i-player (zero steps away on the graph), and (2) \({\pi }_{i}^{(1)}=\sum_{j=1}^{n}{q}_{j| i}\langle {\pi }_{j}^{{{{{{{{{\bf{k}}}}}}}}}_{+i}}\rangle\), the expected accumulated payoff of the i-player’s neighbors (one step away on the graph). These concepts suggest that \({\dot{x}}_{i}\propto {x}_{i}({\pi }_{i}^{(0)}-{\pi }_{i}^{(1)})\). Under pairwise comparison, the reproduction rate of i-players is dependent on how their accumulated payoff exceeds that of their neighbors. In essence, the evolution of xi is the competition between an individual and its first-order neighbors, which aligns with the results obtained by a different theoretical framework in two-strategy systems18,20,32. We further extend it to n-strategy systems in the framework of pair approximation. We also verify that the death-birth rule is essentially the competition between an individual and its second-order neighbors for n-strategy systems (Supplementary Information).

Applying Eqs. (2) and (3) to Eq. (4), we can transform the expected accumulated payoff in the replicator equations into the expected single-game payoff, as shown in Eq. (13) in the Methods, which keeps the simplest irreducible computational complexity given the payoff structure aik. In particular, we only need to calculate two types of quantities, \({\langle {a}_{i| {{{{{{{{\bf{k}}}}}}}}}_{+j}}\rangle }_{i}\) and \({\langle {a}_{i| {{{{{{{{\bf{k}}}}}}}}}_{+j}}\rangle }_{j}\) for i, j = 1, 2, …, n, based on the given payoff structure aik. The diagonal elements of these quantities coincide, as demonstrated in Eqs. (14) and (15) in the Methods. Therefore, for any given payoff structure aik, there are at most (2n − 1)n distinct quantities to calculate manually when determining the replicator equations. The computational complexity is thus O(n2), square of the number of strategies, which can be solved within polynomial time. We also find that the computational complexity under the death-birth rule is O(n3), cubic of the number of strategies, which can also be solved within polynomial time (Supplementary Information).

For specific payoff structures, the computational complexity can be further reduced. A common example is linear systems. In such systems, the payoff function includes at most linear terms in k1, k2, …, kn. This allows us to express the general payoff function as \({a}_{i| {{{{{{{\bf{k}}}}}}}}}={\sum }_{j=1}^{n}{b}_{ij}{k}_{j}+{c}_{i}\), where bij represents the coefficient of the linear term and ci is the constant term for i, j = 1, 2, …, n. Applying this special payoff structure to Eq. (13) in the Methods, we can obtain a simplified form of the replicator equation for linear systems,

$${\dot{x}}_{i}=\frac{\delta (k-2)}{2(k-1)}{x}_{i}\left((k+1)({\bar{\pi }}_{i}-\bar{\pi })+3\mathop{\sum }_{j=1}^{n}{x}_{j}({b}_{ii}-{b}_{ij}-{b}_{ji}-{b}_{jj})+6\mathop{\sum }_{j=1}^{n}\mathop{\sum }_{l=1}^{n}{x}_{j}{x}_{l}{b}_{jl}\right).$$
(5)

Here, \({\bar{\pi }}_{i}\) and \(\bar{\pi }\) denote the mean payoff of i-players and all players in a well-mixed population, which can be directly calculated using the traditional replicator dynamics approach (Methods).

As a frequently studied example, the public goods game involves n = 2 strategies within a linear payoff structure. Strategy 1, cooperation (C), pays a cost c which is multiplied by a synergy factor r and distributed among all k + 1 players, while strategy 2, defection (D), pays nothing. The payoff structure can be expressed as b11 = b21 = rc/(k + 1), b12 = b22 = 0, c1 = rc/(k + 1) − c, c2 = 0. Consequently, \({\dot{x}}_{i}\propto {x}_{i}({\bar{\pi }}_{i}-\bar{\pi })\), indicating that evolution favors cooperation when r > k + 1 (Supplementary Note 3.1). Coincidentally, the public goods game exhibits an equivalence between well-mixed and structured populations under pairwise comparison63, a phenomenon not necessarily observed under other update rules45. This equivalence provides a unique opportunity: when introducing additional strategies into the public goods game, the distinct effects of these new strategies in structured populations can be isolated without interference from the existing two strategies. For a general condition when pairwise comparison equates well-mixed and structured populations, we refer to the Supplementary Note 3.1.3.

We apply the multi-strategy multiplayer framework to various additional mechanisms in public goods games, including punishment8,51 (n = 3), reward64 (n = 3), and multi-stage investment65 (n = 4) (Supplementary Note 3). Here, we present the applications to two punishment types, peer and pool punishment, by which we revisit the well-known second-order free-rider problem in structured populations.

Peer punishment in public goods games

In public goods games with peer punishment66,67, the payoff structure is linear (Supplementary Note 3.2), which allows us to utilize Eq. (5) directly.

There are n = 3 strategies: 1 = Cooperation (C), 2 = Defection (D), and 3 = Peer punishment (E). Besides the two strategies in the public goods game, the third strategy, peer punishment, pays a cost α for punishing a co-player who defects. A defector, when punished, incurs a fine β. Thus, given k2 defective co-players, a punishing player has αk2 paid, and given k3 punishment co-players, a defector has βk3 charged. Furthermore, it is assumed that punishing players also perform the cooperative behavior, investing c to the common pool. This makes the strategy C the second-order free-rider who exploits the effort in punishment of strategy E.

The first question is how the behaviors of peer punishment in structured populations, obtained by our framework (Supplementary Note 3.2), differ from the ones in a well-mixed population. We find that peer punishment introduces a bi-stable space of the system state, as seen in Fig. 3a, b. Even when r < k + 1, the system can either evolve to a final state where strategies E and C coexist, or to a state dominated by strategy D, depending on the initial conditions. As the punishing fine β increases, the basin of attraction for strategy D diminishes. In a well-mixed population, strategy D maintains a basin of attraction regardless of the punishment strength (Fig. 3c). This aligns with previous findings that peer punishment does not truly resolve social dilemmas in well-mixed populations59. However, in structured populations, we observe that the basin of attraction for strategy D can be entirely eliminated if the punishing fine β exceeds a critical threshold, β > β, where

$${\beta }^{\star }=\frac{k+1}{3}\left(-\frac{rc}{k+1}+c+k\alpha \right)-\alpha.$$
(6)

Consequently, in such scenarios, the system consistently converges to a coexistence of strategies E and C (Fig. 3d). The numerical observation from previous research suggests that peer punishment can effectively resolve social dilemmas in structured populations. Our analysis adds an analytical perspective to this conclusion.

Fig. 3: Peer punishment can resolve the social dilemma of public goods game in structured populations.
figure 3

The traditional replicator dynamics produce results for well-mixed populations, and our framework allows for exploring the dynamics in structured populations. a and b The state space is bifurcated by x(DE) and \({{{{{{{{\bf{x}}}}}}}}}_{\star }^{(CE)}\). The final state, either D or (C + E)V, is determined by the initial conditions. Under mild punishment (β = 0.7), a structured population hinders cooperation by reducing the state space leading to the (C+E)V outcome. c and d Conversely, with strong punishment (β = 5), structured populations consistently result in the extinction of defection, thereby resolving the social dilemma in public goods games. In contrast, the state space in well-mixed populations remains divided into two distinct basins. e As the punishment strength β increases, \({x}_{D}^{(DE)}\) increases, expanding the initial space leading to the (C+E)V outcome. In well-mixed populations, \({x}_{D}^{(DE)}\to 1\) as β → , and the basin leading to defection cannot be completely eliminated. However, in structured populations, \({x}_{D}^{(DE)}\to k(k+1)/[(k-2)(k+3)] \, > \, 1\) when β > β, invariably resulting in the extinction of defection. f The diagram of the different effects of punishment in well-mixed versus structured populations. Structured populations are advantageous in promoting cooperation under strong punishment but are less effective when the punishment is mild. Input parameters: r = 3, c = 1, α = 0.7, k = 4.

The distinct roles of peer punishment in well-mixed and structured populations can be attributed to the fraction of defectors, \({x}_{D}^{(DE)}\), in an unstable edge equilibrium, \({{{{{{{{\bf{x}}}}}}}}}^{(DE)}=(0,\, {x}_{D}^{(DE)},\, 1-{x}_{D}^{(DE)})\), as presented in Fig. 3e. When \({x}_{D}^{(DE)} \, > \, 1\), this unstable equilibrium disappears, rendering the D-vertex equilibrium unstable. In a well-mixed population, \({x}_{D}^{(DE)} \, < \, 1\) and \({x}_{D}^{(DE)}\to 1\) as β → , indicating that the described scenario is unattainable. However, in structured populations, \({x}_{D}^{(DE)} \, > \, 1\) becomes feasible once β > β. Additionally, peer punishment acts as a double-edged sword. When \({x}_{D}^{(DE)} \, < \, 0\), the system invariably converges to the full defection state, signifying ineffective punishment. As the punishing fine β increases, peer punishment first becomes effective in well-mixed populations when \(\beta \, > \, {\beta }_{0}^{{{{{{{{\rm{WM}}}}}}}}}\). Structured populations, in contrast, require a higher β0 value for punishment to be effective. In particular, peer punishment is less advantageous in structured populations than in well-mixed populations when β < β= (Fig. 3a, b). However, structured populations gain an advantage when β > β=, and can eventually lead to the extinction of defection at sufficient high β > β values. The comparison between well-mixed and structured populations in relation to the punishing fine is illustrated in Fig. 3f, and the expressions for key β values are listed in Fig. 4.

Fig. 4: Analytical results of public goods game with peer punishment in both well-mixed and structured populations.
figure 4

This table summarizes the equilibrium points and their stability by analyzing the peer punishment system. The analytical forms of all key concepts are presented. See Supplementary Note 3.2 for details of the analysis.

We also compare the analytical predictions by our framework to the results from previous work, which was only at a numerical level. As shown in Fig. 5, we find our analytical results align qualitatively with the α-β phase diagrams presented in previous research51. Although there are differences in detail between the results obtained from non-marginal selection through numerical simulations (Fig. 5a, c) and those derived under weak selection via analytical solutions (Fig. 5b, d), both approaches consistently predict unique behaviors in structured populations that are absent in well-mixed populations. For instance, both the nonmarginal and weak selection strengths indicate the existence of a (C + E)V phase at low α and high β, where strategy D becomes extinct and strategies C and E coexist, equivalent to the Voter model68,69. Moreover, at moderate levels of α and β, we anticipate a D (C+E)V phase under weak selection. In this phase, the system evolves towards either D or (C+E)V based on the initial state, although strategy C may eventually become extinct due to the continuous introduction of a small number of defectors70. A similar phase, named Dh(E), is detected under non-marginal selection. The term ‘h’ denotes ‘homoclinic instability’, implying that strategy E can overcome D through a nucleation mechanism, particularly if a small colony of E players survives after the extinction of cooperators. This likelihood increases with larger populations.

Fig. 5: Phase diagrams of the system behavior with respect to punishing cost α and fine β are qualitatively similar under non-marginal and weak selection strength.
figure 5

a and c (the data are in agreement with those published in Figs. 3 and 4 from ref. 51), Numerical simulations under non-marginal selection (δ = 2). Here, the different phases are defined as follows: D–only D exists; Dh(E)only D or E exists based on certain probabilities; D + C + ED, C, and E coexist; (C+E)VC and E coexist like in the Voter model. b and d, The phase diagram created by analytical β0 and β under weak selection (δ → 0+), where β0 divides the D and D (C+E)V phases, β separates the D (C+E)V and (C+E)V phases. Specifically, in b, β0 = 3/17 + (3/17)α (red), β = 1 + (17/3)α (blue); in d, β0 = 3/34 + (3/17)α. The definition of the D (C+E)V phasethe system finally evolves to either the D phase or the (C+E)V phase, depending on initial conditions. Other parameters: c = 1, k = 4.

Pool punishment in public goods games

Another example is pool punishment in public goods games8,71. From the perspective of computational complexity, pool punishment differs from peer punishment in its nonlinear payoff structure, which requires utilizing Eq. (13).

Similarly, there are n = 3 strategies: 1 = Cooperation (C), 2 = Defection (D), and 3 = Pool punishment (O). Again, based on the 2-strategy public goods game, the third strategy, pool punishment, contributes a cost α to the public pool for punishment. A defector is punished with a fine β if the public pool for punishment has funds (i.e., there is at least one punisher among the co-players); if no punishers are present, the defector incurs no charge. Irrespective of the number of defecting co-players 0 ≤ k2 ≤ k, a punishing player pays α. It is also assumed that those employing pool punishment engage in cooperative behavior, investing c to the common pool, making the strategy C a second-order free-rider.

Our analysis in structured populations (Supplementary Note 3.3) and the traditional analysis for well-mixed populations reveal that pool punishment does not change the fact that the system cannot converge to a defection-free state when r < k + 1, as demonstrated in Fig. 6a, b. However, along the DO edge (i.e., without the presence of strategy C), the system can evolve to a final state of either full D or full O, depending on the initial conditions. As the punishing fine β increases, the attraction basin for strategy D shrinks. In well-mixed populations, strategy D retains an attraction basin regardless of the punishment strength (Fig. 6c). This is consistent with previous findings that pool punishment does not effectively resolve social dilemmas in well-mixed populations59. However, in structured populations, the attraction basin for strategy D can be completely eliminated if the punishing fine β exceeds a critical threshold, β > β, where

$${\beta }^{\star }=\frac{k+1}{2}\left(-\frac{rc}{k+1}+c+\alpha \right).$$
(7)

Given that O and C are still unstable, the system consequently enters a cyclic dominance pattern among D, O, and C in such scenarios (Fig. 6d). The cyclic dominance follows the sequence D → O → C → D →   . Numerical observations from previous studies suggest that pool punishment can resolve social dilemmas in structured populations by inducing a cycle of defection8. Our theoretical approach provides accurate insight into this phenomenon.

Fig. 6: Pool punishment can resolve the social dilemma of public goods game in structured populations.
figure 6

a and b In the three-strategy system space, the state consistently converges to full D. However, along the DO edge, an unstable equilibrium point, x(DO), creates a bi-stable space. In the D versus O dynamics, the final state, either D or O, is determined by the initial conditions. Under mild punishment (β = 1.2), a structured population tends to favor defection, reducing the basin leading to the O outcome. c and d Conversely, with strong punishment (β = 5), structured populations result in the cyclic dominance of the three strategies, thereby preventing the full D state in public goods games. In contrast, the state space in well-mixed populations remains two distinct basins on the DO edge, preventing cyclic dominance. e As the punishment strength β increases, \({x}_{D}^{(DO)}\) increases, expanding the initial space leading to the O outcome. In well-mixed populations, \({x}_{D}^{(DO)}\to 1\) as β → , and the basin leading to defection cannot be completely eliminated. However, in structured populations, \({x}_{D}^{(DO)}\to [{(k+1)}^{1/k}{(k-1)}^{1-1/k}-1]/(k-2) \, > \, 1\) when β > β, invariably resulting in the cyclic dominance of the three strategies. f The diagram of the different effects of punishment in well-mixed versus structured populations. Structured populations are advantageous in promoting cooperation under strong punishment but are a bit less effective when the punishment is mild. Input parameters: r = 3, c = 1, α = 0.7, k = 4.

Similarly, the distinct impacts of pool punishment in well-mixed and structured populations can be identified by the fraction of defectors, \({x}_{D}^{(DO)}\), in an unstable edge equilibrium, \({{{{{{{{\bf{x}}}}}}}}}^{(DO)}=(0,\, {x}_{D}^{(DO)},\, 1-{x}_{D}^{(DO)})\), as shown in Fig. 6e. When \({x}_{D}^{(DO)} \, > \, 1\), this unstable equilibrium vanishes, leading to instability of the D-vertex equilibrium. In well-mixed populations, \({x}_{D}^{(DO)} \, < \, 1\) and \({x}_{D}^{(DO)}\to 1\) as β → , suggesting that the described scenario is unfeasible. Conversely, in structured populations, \({x}_{D}^{(DO)} \, > \, 1\) becomes true once β > β. Pool punishment also presents a paradoxical effect: when \({x}_{D}^{(DO)} \, < \, 0\), the system consistently converges to the full defection state, even along the DO edge, indicating ineffective punishment. As the punishing fine β increases, pool punishment first becomes effective in well-mixed populations at \(\beta \, > \, {\beta }_{0}^{{{{{{{{\rm{WM}}}}}}}}}\). Structured populations, in contrast, require a bit higher β0 threshold for effective punishment. Pool punishment is less advantageous in structured populations than in well-mixed populations when β < β= (Fig. 6a, b). Nevertheless, structured populations gain an advantage when β > β=, and can eventually prevent the fixation of defection by inducing cyclic dominance among the three strategies at sufficient high β > β. The comparison between well-mixed and structured populations in relation to the punishing fine is shown in Fig. 6f, with the expressions for key β values listed in Fig. 7.

Fig. 7: Analytical results of public goods game with pool punishment in both well-mixed and structured populations.
figure 7

This table summarizes the equilibrium points and their stability by analyzing the pool punishment system. The analytical forms of all key concepts are presented. See Supplementary Note 3.3 for details of the analysis.

Again, we compare our analytical predictions to the results from previous numerical work. The analytical results are in qualitative agreement with the α-β phase diagrams from previous research8, as shown in Fig. 8. Again, while there are detailed differences between outcomes derived from non-marginal selection through numerical simulation (Fig. 8a, c) and those obtained under weak selection with analytical methods (Fig. 8b, d), both approaches indicate distinct behavioral patterns in structured populations that are not observed in well-mixed populations. For example, both non-marginal and weak selection indicate the existence of a cyclic dominance phase, (D + C + O)C, at low α and high β, where strategy D invades C, strategy C invades O, and strategy O invades D. Moreover, at moderate levels of α and β, we predict a DOD phase under weak selection. In this phase, the system consistently evolves towards full D in the three-strategy space; however, in the absence of strategy C, the system instead evolves towards either full O or full D based on the initial state. A comparable phase, named FOD, is detected under non-marginal selection. The term ‘F’ denotes ‘fixation’, which means that system evolves towards either full O or full D.

Fig. 8: Phase diagrams of the system behavior with pool punishment are qualitatively similar under non-marginal and weak selection strength.
figure 8

a and c (the data are in agreement with those published in Figs. 5 and 10 from ref. 8), Numerical simulations under non-marginal selection (δ = 2). The phases are defined as follows: D–only D exists; O–only O exists; (D+C+O)C–cyclic dominance among D, C, and O; FOD–fixation of either O or D; DOD and O coexist; (D+C+DO)C–cyclic dominance among D, C, and DO. b and d, The phase diagram is divided by analytical β0 and β under weak selection (δ → 0+). Here, β0 divides the D and DOD phases, while β separates the DOD and (D+C+O)C phases. Specifically, in b, β0 = 81/134 + (135/134)α (red), β = 3/2 + (5/2)α (blue); in d, β0 = 81/268 + (135/134)α, β = 3/4 + (5/2)α. The definition of the DOD phase–the system finally evolves to full D if cooperation is initially present, or to the fixation of either O or D in the absence of initial cooperators. Other parameters: c = 1, k = 4.

Discussion

Spatial evolutionary dynamics under weak selection can be considered as the incorporation of a marginal game effect (δ → 0+) on the Voter model68,69 (δ = 0). In structured populations, identical strategies naturally become adjacent to each other, forming clusters through neutral drift, a process independent of the game, as described by the first-order Taylor expansion in edge dynamics. This inherent tendency for the same strategies to cluster together leads to what is known as spatial reciprocity, a phenomenon captured by the second-order Taylor expansion. Simply put, under weak selection, clusters of the same strategy, caused by spatial structures, unilaterally affect the emergence of cooperation. Conversely, the evolution of cooperation does not influence the spatial pattern of these clusters. This character under weak selection reduces computational complexity, making the closed solution for various evolutionary dynamics such as multi-strategy systems on graphs possible.

In the family of evolutionary graph theory with weak selection and pair approximation, which covers two-strategy two-player, multi-strategy two-player, and two-strategy multiplayer games, we fill in the last piece of the puzzle: the multi-strategy multiplayer games. For a focal individual, we illustrate every possible configuration in which k identical co-players are distributed among n distinct strategies. On this basis, we calculate the group-based payoff for any number of strategies via a bottom-up approach. While we identify each co-player by pair approximation, the (k + 1)-player game is treated as a whole and the smallest indivisible unit in our statistical analysis. In this way, the payoff computation for the focal individual is not merely a sum of pairwise interactions, but rather an n-element function of the configuration k = (k1, k2, …, kn), determined by all co-players simultaneously. The nonlinearity of payoff functions cannot be derived from the superposition pairwise interactions, which reflects the higher-order properties of multi-strategy multiplayer games that are different from multi-strategy two-player games35.

Building on the group-based payoff calculation, we develop strategy update dynamics on regular graphs using the standard pair approximation method29,33 under two common update rules: pairwise comparison and death-birth. Interestingly, our general findings are in line with those previously obtained through a different theoretical approach for two-strategy systems18. In particular, our n-strategy replicator equations imply that pairwise comparison equates to competition among all n strategies between first-order neighbors, while death-birth is equivalent to competition among second-order neighbors. While this is consistent with the previous conclusions for two-strategy systems18, our results further extend them to the generalized n-strategy space.

It is worth mentioning that by contrasting a profile of our results with the other approach in two-strategy public goods games47,48, we can see the limitations of pair approximation: unlike their approach, which can account for triangle motifs, our pair approximation cannot. According to previous works on pair approximation45,46, we see that under the death-birth rule (i.e., second-order neighbor competition), pair approximation results align with the other approach only in the absence of triangle motifs. Under the pairwise comparison rule (i.e., first-order neighbor competition), however, triangle motifs appear to have no effect on multiplayer games48, where the results of pair approximation always match those of the other approach, which considers triangle motifs. This is one reason why pairwise comparison is the primary focus of this paper. For rigor, applying our framework to a specific network structure is best followed by our basic assumption: the absence of triangle motifs. We look forward to a new theory in the future that will cancel this assumption.

For any multi-strategy multiplayer game in our framework, we need only input the payoff function aik for each strategy i across all (k + n − 1)!/[(n − 1)!k!] co-player strategy configurations k. Then, we can apply the general formula provided in this work to obtain the replicator equations on a regular graph. For general payoff functions, we have decomposed the general replicator equation into sums of expected single-game payoffs, as shown in Eq. (13) (for PC updates) and Supplementary Eq. (S184) (for DB updates). From there, it simplifies the problem to calculating the single games under different strategy configurations and then summing them up. The computation is feasible in polynomial time, which is related to the number of strategies n. We find the computational complexity is O(n2) for pairwise comparison and O(n3) for death-birth. For certain specific payoff functions, the general formula may be further simplified, depending on whether the expected payoff across different strategy configurations has a simple primitive functional form. As an example, we provide a simple general formula for linear payoff functions in both pairwise comparison and death-birth updates, as shown in Eq. (5) and Supplementary Eq. (S187).

As an application of our theoretical framework, we revisit the second-order free-riding problem. In a simple three-strategy system of cooperation, defection, and cooperative punishment, the defection strategy is a free-rider from cooperation, while the original cooperation is also a free-rider from cooperative punishment. Prior research has shown that costly punishment in well-mixed populations cannot truly resolve social dilemmas59, although in structured populations it can8,51. We further interpret the conclusion within our analytical framework, revealing an accurate threshold for punishment strength β in both linear peer punishment and nonlinear pool punishment systems. When the punishment strength β > β, costly punishment can resolve the social dilemma in structured populations. In peer punishment, a sufficiently strong punishment eliminates the attraction basin of full defection in the bi-stable state space. In pool punishment, a strong enough punishment leads the system to a rock-paper-scissors-like cyclic dominance. The results obtained under weak selection also qualitatively reproduce the phase diagrams found in earlier numerical studies under non-marginal selection8,51, identifying unique phases observable only in structured populations.

In addition, our general n-strategy dynamics framework can reduce to classic two-strategy multiplayer game dynamics at n = 2. First, although some prior work has explored specific models under pairwise comparison55,72, to our knowledge, no work has provided a general replicator equation and discussion for two-strategy multiplayer games under pairwise comparison. As a complement to this, we discuss the general replicator equation when n = 2 for two-strategy multiplayer games under pairwise comparison in Supplementary Note 2.5. Second, the general replicator equations for two-strategy multiplayer games under death-birth have been discussed by Li et al.46. We show that our results obtained under death-birth are identical to theirs at n = 2 (Supplementary Note 4.5).

Our theoretical framework is widely applicable, yielding analytical solutions for numerous multistrategy multiplayer game models previously proposed. Besides the two punishment mechanisms investigated in the main text, we also explore the reward mechanism9,52,53,64 (a mirror mechanism to punishment) and multi-stage public goods game65 (a four-strategy system) in Supplementary Information. Classic three-strategy games remaining unexplored include tax-based reward and punishment systems10, the loner strategy4,54, and so on. Moreover, a pair of additional strategies can be introduced together to create four-strategy systems, such as the competition between peer and pool punishment51. In fact, provided coevolutionary factors expressed as payoff functions of co-player configurations, any multi-strategy multiplayer game system can be analyzed within our framework.

Methods

Bottom-up statistical quantities

Here, we provide the microscopic details behind the expected payoffs. There are several necessary variations of k for expressing the details. One is k+l, which still contains k co-players but describe a configuration with at least one l-player. The variables satisfy \(\sum_{\ell=1}^{n}{k}_{\ell }=k-1\), with the number of l-players (when  = l) written as kl + 1. Similarly, ki,+j can describe a configuration where the variables satisfy \({\sum }_{\ell=1}^{n}{k}_{\ell }=k\), with the number of i-players as ki − 1 and j-players as kj + 1. Also, note that \({{{{{{{{\bf{k}}}}}}}}}^{{\prime} }\) and k are different variables and have no relation with k. The primes are only to distinguish the sequence of summations: we do the computation on k, \({{{{{{{{\bf{k}}}}}}}}}^{{\prime} }\), and finally k.

We start from the level where k is given. Given neighbor configuration k for a focal j-player, its accumulated payoff can be expressed as

$${\pi }_{j}^{{{{{{{{\bf{k}}}}}}}}}={a}_{j| {{{{{{{\bf{k}}}}}}}}}+\mathop{\sum}_{l=1}^{n}{k}_{l}\mathop{\sum}_{{\sum}_{\ell=1}^{n}{k}_{\ell }^{{\prime} }=k-1}\frac{(k-1)!}{{\prod}_{\ell=1}^{n}{k}_{\ell }^{{\prime} }!}\left({\prod }_{\ell=1}^{n}{{q}_{\ell | l}}^{{k}_{\ell }^{{\prime} }}\right){a}_{j| {{{{{{{{\bf{k}}}}}}}}}_{+l}^{{\prime} }}.$$
(8)

The j-player accumulates payoff from the games organized by itself and its \(\sum_{l=1}^{n}{k}_{l}=k\) neighbors. The visualization of Eq. (8) is shown in Fig. 9a and b. Similarly, the accumulated payoff of an i-player neighboring a j-player, given the j-player’s neighbor configuration k, can be expressed and calculated by

$$\pi_{i| \, j}^{{{{{{\mathbf{k}}}}}}} =a_{i|{{{{{{\mathbf{k}}}}}}}_{-i,+j}}+{\sum}_{{\sum}_{l=1}^n k^{\prime}_l=k-1}\frac{(k-1)!}{{\prod}_{l=1}^n k^{\prime}_l!} \left(\mathop{\prod}_{l=1}^n {q_{l|i}}^{k^{\prime\prime}_l}\right) \\ \left(a_{i|{{{{{{\mathbf{k}}}}}}}^{\prime}_{+j}}+{\sum}_{l=1}^n k^{\prime}_l {\sum}_{{\sum}_{\ell=1}^n k^{\prime\prime}_\ell=k-1}\frac{(k-1)!}{{\prod}_{\ell=1}^n k^{\prime}_\ell!} \left({\prod}_{\ell=1}^n {q_{\ell|l}}^{k^{\prime\prime}_\ell}\right) a_{i|{{{{{{\mathbf{k}}}}}}}^{\prime\prime}_{+l}} \right).$$
(9)

The i-player accumulates payoff from the games organized by the j-player (Fig. 9c), itself (Fig. 9d), and its remaining \(\sum_{l=1}^{n}{k}_{l}^{{\prime} }=k-1\) neighbors (Fig. 9e). Further explanations of Eqs. (8) and (9) can be found in Supplementary Information.

Fig. 9: Visualization of the bottom-up statistical payoff calculation when every individual has k = 4 neighbors.
figure 9

a To calculate the group-based payoff of a focal j-player with neighbor configuration k, we first have the game organized by the j-player itself, with k as the co-player configuration. b Then, we have \({\sum }_{l=1}^{n}{k}_{l}=k\) games organized by the j-player’s neighbors who adopt strategy l = 1, 2, …, n. For the j-player, the co-player configuration contains k − 1 undetermined co-players neighboring the l-player, plus the l-player. c To calculate the group-based payoff of an i-player neighboring a focal j-player, with j-player’s neighbor configuration k, we first have the game organized by the j-player. For the i-player, the co-player configuration is ki,+j. d Then, we have the game organized by the i-player itself, with k − 1 undetermined co-players neighboring the i-player plus the j-player. e Finally, we have the games organized by the i-player’s remaining \({\sum }_{l=1}^{n}{k}_{l}^{{\prime} }=k-1\) neighbors. For the i-player, the co-player configuration contains k − 1 undetermined co-players neighboring the l-player, plus the l-player.

Based on the microscopic quantities given specific k, we can further express expected values over all possible k. The expected payoff of a focal X-player over all possible k can be statistically computed as

$$\left\langle {\pi }_{X}^{{{{{{{{\bf{k}}}}}}}}}\right\rangle={\sum}_{{\sum}_{{i}^{{\prime} }=1}^{n}{k}_{{i}^{{\prime} }}=k}\frac{k!}{{\prod}_{{i}^{{\prime} }=1}^{n}{k}_{{i}^{{\prime} }}!}\left({\prod }_{{i}^{{\prime} }=1}^{n}{{q}_{{i}^{{\prime} }| X}}^{{k}_{{i}^{{\prime} }}}\right){\pi }_{X}^{{{{{{{{\bf{k}}}}}}}}}.$$
(10)

The possible neighbor configurations satisfying \(\sum_{{i}^{{\prime} }=1}^{n}{k}_{{i}^{{\prime} }}=k\) are found around the X-player as identified by \({q}_{{i}^{{\prime} }| X}\). Here, \({i}^{{\prime} }\) is an independent count, which has no relation with i.

Similarly, the expected payoff of an i-player neighboring an X-player over all possible neighbor configuration k+i of the X-player can be expressed as

$$\left\langle {\pi }_{i| X}^{{{{{{{{{\bf{k}}}}}}}}}_{+i}}\right\rangle={\sum}_{\sum_{{i}^{{\prime} }=1}^{n}{k}_{{i}^{{\prime} }}=k-1}\frac{(k-1)!}{{\prod}_{{i}^{{\prime} }=1}^{n}{k}_{{i}^{{\prime} }}!}\left({\prod }_{{i}^{{\prime} }=1}^{n}{{q}_{{i}^{{\prime} }| X}}^{{k}_{{i}^{{\prime} }}}\right){\pi }_{i| X}^{{{{{{{{{\bf{k}}}}}}}}}_{+i}}.$$
(11)

Here, k+i is because we have a specific i-player in the neighbor configuration of the X-player. The remaining k − 1 neighbors \(\sum_{{i}^{{\prime} }=1}^{n}{k}_{{i}^{{\prime} }}=k-1\) of the X-player are found around the X-player, as identified by \({q}_{{i}^{{\prime} }| X}\).

Eqs. (9) and (11) may seem redundant because they do not appear directly in the final results. However, they are crucial in the process of deductions for both pairwise comparison and death-birth rules.

We also have the similar notation for expected single-game payoff. To specify, the expected payoff of an i-player in a single game over all co-player configuration k, where k is found neighboring an X-player, is expressed as

$${\langle {a}_{i| {{{{{{{\bf{k}}}}}}}}}\rangle }_{X}={\sum}_{\sum_{{i}^{{\prime} }=1}^{n}{k}_{{i}^{{\prime} }}=k}\frac{k!}{{\prod}_{{i}^{{\prime} }=1}^{n}{k}_{{i}^{{\prime} }}!}\left({\prod }_{{i}^{{\prime} }=1}^{n}{{q}_{{i}^{{\prime} }| X}}^{{k}_{{i}^{{\prime} }}}\right){a}_{i| {{{{{{{\bf{k}}}}}}}}}.$$
(12)

The concepts in Eqs. (8)–(12) are sufficient to identify the difference between this work and the previous literature29,33,46. In particular, they emphasize that the minimal unit to refer is the co-player configuration k, based on which the payoff of the multiplayer game is computed. We do not try to decompose the multi-body interaction identified by k into multiple pairwise interactions.

Combined with the pair approximation method33 and detailed calculations in the strategy evolution dynamics, we can then obtain the master equation (4) in the main text (see Supplementary Note 2.4.3 for the approach of pair approximation deduction).

The decomposition to single games

The following form of the master equation is important, which holds for any n-strategy multiplayer game and allows us to obtain the replicator dynamics by summing the expected payoff calculations in a series of single games:

$${\dot{x}}_{i}= \frac{\delta (k-2)}{2(k-1)}{x}_{i}{\sum }_{j=1}^{n}{x}_{j}\Bigg({\langle {a}_{i| {{{{{{{{\bf{k}}}}}}}}}_{+j}}\rangle }_{i}+(k-1){\langle {a}_{i| {{{{{{{{\bf{k}}}}}}}}}_{+j}}\rangle }_{j}+{\langle {a}_{i| {{{{{{{{\bf{k}}}}}}}}}_{+i}}\rangle }_{i} -{\langle {a}_{j| {{{{{{{{\bf{k}}}}}}}}}_{+i}}\rangle }_{j}\\ -{\langle {a}_{j| {{{{{{{{\bf{k}}}}}}}}}_{+i}}\rangle }_{i}-(k-2)\mathop{\sum }_{j=1}^{n}{x}_{l}{\langle {a}_{j| {{{{{{{{\bf{k}}}}}}}}}_{+l}}\rangle }_{l}-{\langle {a}_{j| {{{{{{{{\bf{k}}}}}}}}}_{+j}}\rangle }_{j}\Bigg).$$
(13)

In application, given the payoff functions aik where i = 1, 2, …, n, we can compute all 〈  〉 terms and then ensemble them to obtain the replicator equations. The result of each 〈  〉 should be a function of x1, x2, …, xn (transformed from qji manually), degree k, and game parameters.

The advantage of Eq. (13) is that we have attributed everything about 〈  〉 into two types, the ‘\({\langle {a}_{i| {{{{{{{{\bf{k}}}}}}}}}_{+j}}\rangle }_{i}\) type’ and the ‘\({\langle {a}_{i| {{{{{{{{\bf{k}}}}}}}}}_{+j}}\rangle }_{j}\) type’. They can be expressed by matrices through i and j:

  • The \({\langle {a}_{i| {{{{{{{{\bf{k}}}}}}}}}_{+j}}\rangle }_{i}\) type:

$${\left[{\langle {a}_{i| {{{{{{{{\bf{k}}}}}}}}}_{+j}}\rangle }_{i}\right]}_{ij}=\left(\begin{array}{cccc}{\langle {a}_{1| {{{{{{{{\bf{k}}}}}}}}}_{+1}}\rangle }_{1}&{\langle {a}_{1| {{{{{{{{\bf{k}}}}}}}}}_{+2}}\rangle }_{1}&\cdots \,&{\langle {a}_{1| {{{{{{{{\bf{k}}}}}}}}}_{+n}}\rangle }_{1}\\ {\langle {a}_{2| {{{{{{{{\bf{k}}}}}}}}}_{+1}}\rangle }_{2}&{\langle {a}_{2| {{{{{{{{\bf{k}}}}}}}}}_{+2}}\rangle }_{2}&\cdots \,&{\langle {a}_{2| {{{{{{{{\bf{k}}}}}}}}}_{+n}}\rangle }_{2}\\ \vdots &\vdots &\ddots &\vdots \\ {\langle {a}_{n| {{{{{{{{\bf{k}}}}}}}}}_{+1}}\rangle }_{n}&{\langle {a}_{n| {{{{{{{{\bf{k}}}}}}}}}_{+2}}\rangle }_{n}&\cdots \,&{\langle {a}_{n| {{{{{{{{\bf{k}}}}}}}}}_{+n}}\rangle }_{n}\end{array}\right).$$
(14)
  • The \({\langle {a}_{i| {{{{{{{{\bf{k}}}}}}}}}_{+j}}\rangle }_{j}\) type:

$${\left[{\langle {a}_{i| {{{{{{{{\bf{k}}}}}}}}}_{+j}}\rangle }_{j}\right]}_{ij}=\left(\begin{array}{cccc}{\langle {a}_{1| {{{{{{{{\bf{k}}}}}}}}}_{+1}}\rangle }_{1}&{\langle {a}_{1| {{{{{{{{\bf{k}}}}}}}}}_{+2}}\rangle }_{2}&\cdots \,&{\langle {a}_{1| {{{{{{{{\bf{k}}}}}}}}}_{+n}}\rangle }_{n}\\ {\langle {a}_{2| {{{{{{{{\bf{k}}}}}}}}}_{+1}}\rangle }_{1}&{\langle {a}_{2| {{{{{{{{\bf{k}}}}}}}}}_{+2}}\rangle }_{2}&\cdots \,&{\langle {a}_{2| {{{{{{{{\bf{k}}}}}}}}}_{+n}}\rangle }_{n}\\ \vdots &\vdots &\ddots &\vdots \\ {\langle {a}_{n| {{{{{{{{\bf{k}}}}}}}}}_{+1}}\rangle }_{1}&{\langle {a}_{n| {{{{{{{{\bf{k}}}}}}}}}_{+2}}\rangle }_{2}&\cdots \,&{\langle {a}_{n| {{{{{{{{\bf{k}}}}}}}}}_{+n}}\rangle }_{n}\end{array}\right).$$
(15)

There are n2 elements in each matrix. Their diagonals are equal, meaning we can compute n fewer elements. Therefore, the total amount of computation is n2 + n2 − n = (2n − 1)n elements. The computational complexity is O(n2), which can be accomplished in polynomial time (we also see that the death-birth rule’s computational complexity is O(n3), as specified in Supplementary Information).

Special linear system

Although Eq. (13) allows general calculations of any multiplayer game, we do not have to employ it directly every time. For some special payoff structures, we can deduce simplified general forms in advance. Here, we present the general results of a commonly studied subclass, the linear multiplayer games.

The linear multiplayer games in this work are defined as those whose payoff structure can be expressed as linear functions of co-player \({{{\mathbf{k}}}}=(k_1,k_2,\, \ldots ,k_n)\). That is, \({a}_{i| {{{{{{{\bf{k}}}}}}}}}={\sum }_{j=1}^{n}{b}_{ij}{k}_{j}+{c}_{i}\), where

$${{{{{{{\bf{b}}}}}}}}=\left(\begin{array}{cccc}{b}_{11}&{b}_{12}&\cdots \,&{b}_{1n}\\ {b}_{21}&{b}_{22}&\cdots \,&{b}_{2n}\\ \vdots &\vdots &\ddots &\vdots \\ {b}_{n1}&{b}_{n2}&\cdots \,&{b}_{nn}\end{array}\right),\, {{{{{{{\bf{c}}}}}}}}=\left(\begin{array}{c}{c}_{1}\\ {c}_{2}\\ \vdots \\ {c}_{n}\end{array}\right).$$
(16)

For linear multiplayer games, the payoff structure is completely determined by the matrix b and c.

Once we apply \({a}_{i| {{{{{{{\bf{k}}}}}}}}}={\sum }_{j=1}^{n}{b}_{ij}{k}_{j}+{c}_{i}\) to compute the ‘\({\langle {a}_{i| {{{{{{{{\bf{k}}}}}}}}}_{+j}}\rangle }_{i}\) type’ and the ‘\({\langle {a}_{i| {{{{{{{{\bf{k}}}}}}}}}_{+j}}\rangle }_{j}\) type’ as shown in Eqs. (14) and (15) and transform all qji quantities to xj quantities, we can obtain \({\langle {a}_{i| {{{{{{{{\bf{k}}}}}}}}}_{+j}}\rangle }_{i}=(k-2){\sum }_{l=1}^{n}{b}_{il}{x}_{l}+{b}_{ii}+{b}_{ij}+{c}_{i}\), \({\langle {a}_{i| {{{{{{{{\bf{k}}}}}}}}}_{+j}}\rangle }_{j}=(k-2){\sum }_{l=1}^{n}{b}_{il}{x}_{l}+2{b}_{ij}+{c}_{i}\). Then, substituting the results into Eq. (13) leads to Eq. (5) in the main text.

As we mentioned in the main text, in Eq. (5), \({\bar{\pi }}_{i}\) and \(\bar{\pi }\) are mean payoffs of i-players and all players in a well-mixed population. They can be calculated by the traditional replicator dynamics, but to specify, using the matrix b and c, they can also be written as \({\bar{\pi }}_{i}=k{\sum }_{l=1}^{n}{x}_{l}{b}_{il}+{c}_{i}\), \(\bar{\pi }={\sum }_{i=1}^{n}{x}_{i}{\bar{\pi }}_{i}=k{\sum }_{i=1}^{n}{\sum }_{l=1}^{n}{x}_{i}{x}_{l}{b}_{il}+{\sum }_{i=1}^{n}{x}_{i}{c}_{i}\). Logically, this is how we replace the corresponding terms in Eq. (5) by \({\bar{\pi }}_{i}\) and \(\bar{\pi }\).