# Disordered ecosystems

*Last modified: June 7, 2023, 4:14 p.m.*

Disclaimer: This is work in progress.

### Contents

1.1 Deciding what ingredients to put in a model

1.2 The meaning of randomness and disorder

1.2.1 Story 1 – Objective motivation: high-dimensional dynamics

1.2.2 Story 2 – Pragmatic motivation: null models for robust phenomena

1.2.3 Story 3 – Subjective motivation: aggregation and statistical equivalence

1.3 Limitations and when they might not matter

1.4 Scope of the model

1.4.1 Focal model: Random Lotka-Volterra

1.4.2 Broader landscape of disordered models

2 Behaviors: New and old intuitions to explain what a disordered model will do

2.1 Parameterization before randomness

2.2 Interactions and aggregate parameters

2.3 Non-spatial behaviors: qualitative dynamics

2.3.1 Diffuse (weak to moderate) interactions

2.3.2 Strong (and semi-sparse?) interactions

2.3.3 Noise-dominated regime

2.4 Non-spatial behaviors: quantitative properties

2.4.1 Stability

2.4.2 Invasions and extinctions

2.5 Spatial behaviors

2.5.1 Dispersal in a homogeneous environment

2.5.2 Heterogeneous environments

3 Patterns: What can be predicted, inferred and tested empirically

3.1 Warnings and caveats

3.1.1 Measuring stability

3.1.2 Normalizing observables

3.2 Patterns and robustness

4 Extensions: Partially structured models

4.1 Types of structures

4.1.1 Few important species or links, wide distributions and sparsity

4.1.2 Discrete structures: Modularity, functional groups

4.1.3 Continuous structures: Traits, trade-offs

4.2 New patterns

4.3 Gains compared to traditional few-variable models (e.g. motifs)

5 Conclusions

Ecosystems with many species and variables can seem overwhelmingly complex, but the accumulation of complicated ingredients does not necessarily entail that the whole will behave in complicated ways (just as, conversely, simple ingredients do not always lead to a simple whole). We can imagine at least three plausible scenarios:

- Among all these details, a few matter much more than the rest and we can reduce our focus to those – we can call this “classical simplicity”, since identifying such key mechanisms is a traditional goal of ecology.
- Many or all details matter roughly equally and influence outcomes in different, weakly correlated, directions. Their combined impact will often be simpler than expected (they average out, or create random-like variation), and only certain aggregate properties persist as important causes and consequences – we can call this “emergent simplicity”.
- Many or all details matter in a deeply correlated and interconnected way. We intuit that such a situation is fragile by definition: displacing a single detail can impact all the others in a major way, like a computer program where removing one character will cause the whole to stop working. We can call this “true complexity”, and many biological examples exist within organisms. But we feel it is legitimate to wonder whether ecosystems are sufficiently closed and co-evolved to allow the emergence and persistence of such fragile structures (and whether these structures are the leading forces shaping the behavior of the entire ecosystem).

Our main ideas are that i) classical and emergent simplicity (options 1 and 2), in the right mix, might suffice to capture many aspects of complex-looking ecosystems, and ii) emergent simplicity (option 2) has not been explored enough compared to its classical counterpart, being an equally interesting starting point to improve our understanding of ecological phenomena.

We will argue that understanding emergent simplicity is largely possible through the careful invocation of randomness and probabilistic modeling, more specifically of two kinds: stochasticity, i.e. random fluctuations over time (like environmental perturbations), and disorder, i.e. randomness that is frozen in time (like a random network of interactions, unchanging on ecological time scales).

One aim of this article is to demystify models of emergent simplicity: show when they are useful, when they teach us something new, but also when they do not, and be honest about any conceptual difficulty in applying them^{1}. Such models, including disordered systems, can reveal phenomena that only happen when there are many species, but also tell us when the behavior of a large heterogeneous community is a straightforward extension of what could be understood from only two or a few species^{2}. Emergence is only sometimes magical, and studies of complex-looking models contain many red herrings. We believe an honest and pragmatic presentation will show that only a few new intuitions suffice to capture the main ideas of emergent simplicity and add it to one’s intellectual palette.

We will focus on a specific model as a baseline, Random Lotka-Volterra, because it essentially subsumes other widely-discussed probabilistic (or “null”) models as special cases, such as neutral theory [32] or independent species sampling [3]. More precisely, it is usually not true that one model will exactly subsume the results of others, as they will always tend to differ in small details. But for our purposes, in most empirical situations we imagine, Random Lotka-Volterra gets close enough to approximating many other models, and it definitely captures the main qualitative ideas and mechanisms present in these other models^{3}.

This guide will focus on sloppy intuitions rather than rigorous mathematics, and is organized as follows:

- Sec. 1 lays out the general scope of our research program and context for why disorder is a “good idea”. A given reader may want to read it either first or last depending on taste.
- Sec. 2 is more practical, and focuses on building intuitions for how a Random Lotka-Volterra model behaves, showing how disorder changes the way we can think about ecological communities: what is new, what is old and still valid, and what is old and wrong or irrelevant.
- Sec. 3 is meant to be a technical guide to confronting model predictions and empirical data.
- Finally, Sec. 4 discusses how to go beyond pure disorder and combine it with more classical ecological models, leading to partially-structured models.

### 1 Ingredients: the philosophy and use of disordered models

Our claim and hope is that emergent simplicity (assuming it does occur in ecosystems) makes it possible to make broad and robust statements that do not depend too much on specific model choices. Still, we wish to be very clear about which choices do matter, and when we think this approach is appropriate.

#### 1.1 Deciding what ingredients to put in a model

Let us distinguish three layers in a model and its application:

- ingredients: the mathematical form of the model, its variables and parameters (explored in Sec. 1)
- behaviors: the phenomena, dynamics, regimes etc. that a model can explore, the values that variables take as an outcome (Sec. 2)
- patterns: the observable properties, constructed from variables, that one can directly compare to empirical data while caring about observational errors and biases etc. (Sec. 3)

This tripartition (and especially splitting the usual term of “processes” into ingredients and behaviors) is useful, since different ingredients can produce the same behavior, and different behaviors can produce the same pattern, see e.g. [1].

The simplest models tend to have only one possible qualitative behavior, e.g. they predict that species abundances follow a lognormal distribution, with only quantitative aspects changing as we vary parameter values. But many models have the potential for qualitative changes of behavior: a model of competing species may display either coexistence or extinction; a predator-prey model may reach a fixed equilibrium or a cyclical trajectory. Different behaviors usually avail us of different patterns to compare to data: for instance, we cannot study temporal correlations between species in a fixed equilibrium. Often, the existence of multiple qualitative behaviors reflects the fact that various mechanisms can overcome each other in the model.

The most common way of building a model in ecology is finding ingredients that one assumes are important (e.g. species, nutrients, interactions...), putting them together in a reasonable way, and trying to understand and interpret their outcomes. The tradition of null models attempts the opposite: start from known behaviors or patterns (e.g. species diversity, abundance distribution), and try to find model ingredients that give us these outcomes, ideally assuming as little as possible beyond that^{4}.

In the latter perspective, many differences in ingredients, even striking ones such as being discrete or continuous, deterministic or probabilistic, spatial or non-spatial, can be treated as irrelevant if they do not impact the target outcomes. One way of articulating this idea is: all details matter, but they do not necessarily matter for what we care about. Predictions regarding different aspects of the same system may require different degrees of detail. Fidelity to microscopic ingredients is often unnecessary to make good aggregate predictions (atoms are not bouncy balls at all, but that description is enough to understand an ideal gas). Furthermore, trying to increase microscopic fidelity does not necessarily improve aggregate accuracy, and can even detract from it if poorly estimated parameters are added. Hence, we want to think of models not in terms of how they fit our microscopic intuition, but:

- what kind of macroscopic behaviors and patterns they can encompass at all (e.g. a strictly static model cannot be used to predict dynamical patterns),
- whether we can adjust model parameters so that predicted patterns fit their empirical counterparts quantitatively,
- whether parameters fitted using some patterns also correctly predict other patterns – in which case the model is a workable description of macroscopic reality, even if its details turn out to be untrue to microscopic reality.

#### 1.2 The meaning of randomness and disorder

Since our focus is on emergent simplicity and patterns that arise specifically in many-species systems, we will tend to focus on system-wide properties (aggregate quantities such as total biomass or biodiversity, statistics, distributions...), and forfeit the ability to describe the fate of specific species or individuals.

We now try to explain why this outlook naturally leads to the use of randomness. We propose three stories (all in some sense valid and grounded in mathematical arguments), in the hope that at least one of them will resonate and provide a working intuition for when the use of randomness is a good idea.

##### 1.2.1 Story 1 – Objective motivation: high-dimensional dynamics

When species differ along a single main axis, such as body size, trophic level or some trade-off between two traits, their heterogeneity and interactions are structured by that axis and it is usually necessary to incorporate it explicitly (see examples in Sec. 4).

However, when there are many independent sources of variation, their combined effect tends to behave like a random variable [5]. The canonical example is a dice throw: while a basketball throw is influenced by many factors, some of them are more important and allow prediction (e.g. the skill of the player). In a dice throw, there are many factors interfering with each other and none of them dominates, so the result is effectively random.

In a community, we may think in terms of the direct and indirect paths through which one species can impact another. If species are well-ordered into trophic levels, all paths between a top predator and a plant tend to be correlated in their impacts: the top predator may eat multiple carnivores, which eat multiple herbivores, but it all sums up into a coherent negative impact on the plant. If however there are many more pathways and modes of interaction, a species can have a negative direct impact on its prey, but a positive indirect impact by recycling nutrients, and so on. Many positive and negative indirect effects running through various other species end up interfering with each other, and acting as random noise.

In other words, randomness can be seen on objective grounds as an “attractor” in the space of all possible dynamics, in the sense that more and more complicated systems will often tend to act more and more like randomness generators.

##### 1.2.2 Story 2 – Pragmatic motivation: null models for robust phenomena

Even if we do not accept randomness as being “objectively” a good representation of the dynamics we are modelling, we can still adopt it on pragmatic grounds.

Let us say that we are dealing with intricate complex systems, devoid of any true randomness, but we are only interested in behaviors that are robust, i.e. phenomena or patterns (temporal fluctuations, abundance distributions...) that we believe would remain the same if we were to make a lot of small changes in the details of the studied system, e.g. replace a species by another cousin species, change environmental conditions a little, etc.

Then it is also plausible that the same phenomena or patterns will be observed if these details are drawn at random. Here, we do not claim that ecosystems are really random-like, but only that random systems are not exceptional and will show the same robust behaviors seen across many different non-random systems [6]. Choosing a random model is useful simply because it is easier to manipulate than most non-random ones with the same behaviors, and it thus serves as a “null model”.

##### 1.2.3 Story 3 – Subjective motivation: aggregation and statistical equivalence

A third option is the subjective Bayesian viewpoint: randomness is simply reflective of the scientist being uncertain or uncaring about certain details.

Aggregating always implies that the variables we are aggregating over (e.g. the abundances of various species) are, in some sense, equivalent or exchangeable – that we are not adding up apples and oranges unless we care about total fruit production. This does not mean that these variables are identical in every way, but only that none of them is “special” in how it contributes to the aggregate. This clearly depends on which aggregate pattern is our focus – as discussed in the next section, it may seem strange to add up the biomass of predators and prey given that they have very different ecological consequences, and yet some patterns assume an equivalence between them.

Random models appear like a natural choice when we believe that we are indeed justified in treating species as statistically equivalent for a given pattern. In a random interaction network, species are not identical, but no species occupies a unique role – even a well-connected “hub” species is not an outlier but just a representative sample of the overall community’s distribution of connectedness.

#### 1.3 Limitations and when they might not matter

A simple objection to the use of randomness is that species typically seem to be much more different from each other than simple random draws from a distribution. To clarify this point, we can think about degrees of heterogeneity:

- “variance-like” heterogeneity: quantitative variation within a class (e.g. various pollinators, more or less specialized)
- “systematic” heterogeneity: irreducible qualitative differences between classes (e.g. autotrophs versus heterotrophs).

Disorder is primarily intended for “variance-like” heterogeneity – initially, it was proposed to model physical systems with many particles of roughly the same nature, not ecosystems where organisms differ utterly in scale and physiology, like viruses from trees. Differences between classes (for instance, the fact that a plant usually cannot consume or parasitize an animal) typically must be encoded “by hand” because they are highly consequential and structured, imposed by physics, chemistry, etc.^{5}

But an important lesson from experience is that differences that appear irreducible and fundamental can sometimes be ignored, and entities that seem like they belong to different classes can sometimes be bunched together for practical purposes. A series of allometric relations (e.g. Sheldon’s law, Damuth’s law) suggest that there are comparable amounts of biomass and “energy flux” in all size classes of organisms, from bacteria all the way to whales [8]. While bacteria and whales are not equivalent per capita, they might in some sense be equivalent per unit mass, at least for the purpose of understanding some macroecological patterns.

Thus, whenever we think about using randomness in a model, we must first ask: what is the representation of the system that makes species as equivalent as possible? This is not necessarily the most intuitive representation. For instance, in a resource competition setting where individuals of one species consume much more than individuals of another, the correct variable to express the importance of a species might not be its abundance, but rather its total consumption (see e.g. [9]). Likewise, top predators have disproportionate impacts on food webs, per capita and even per unit mass. The question of whether rare species serve essential functions keeps recurring across ecosystem types (e.g. in microbiomes [10]). Faced with the apparent inequalities between species, we must ponder: how much is really due to some species “winning” or “losing”, being more or less important, and how much is due to the fact that we are not measuring their success in the right units? Once this is clarified, we can decide whether randomness, structure or both are needed to capture the remaining differences (i.e. whether they are variance-like or systematic).

#### 1.4 Scope of the model

We mainly discuss one basic model, the Random Lotka-Volterra model [11], which we believe can encompass a broad scope of behaviors. It has narrowly distributed random parameters: species are different but sufficiently similar to be thought of as belonging to a single type (e.g. various species of competing grasses or bacteria), or they can somehow be made equivalent, see Sec. 1.3.

We can ask how representative that model is, and whether we would do better to start with other ingredients, knowing that:

- many ingredients will turn out to be irrelevant or equivalent in outcomes when there is disorder,
- some ingredients beyond the focal model are still within the scope of this guide: extensions to wide distributions and to partially structured models, discussed in Sec. 4,
- some ingredients are out of scope for this guide but within scope for disordered systems approaches in general, see Sec. 1.4.2,
- some ingredients are simply incompatible with disordered systems approaches.

The latter belong to approaches that may still be valid alternate views of ecosystems, and should thus be compared and contrasted with ours, for instance:

- well-grounded reductionism, taking quantitative ingredients from physiology, physics and chemistry: network of metabolic reactions underlying microbial interactions, size-based physiological arguments, thermodynamics and redox potential, etc.
- exhaustive phenomenological descriptions, e.g. management models for fisheries and forests, where we attempt to estimate every variable, flux and functional form. These are harder to reconcile with our approach when every variable is of a different nature, with no possible equivalence.

##### 1.4.1 Focal model: Random Lotka-Volterra

Our focal model is a spatialized Lotka-Volterra model with growth rates $r_i$, interactions $A_{ij}$ (the carrying capacity is thus given by $K_i = 1/A_{ii}$), environmental perturbations $\xi_i$, and dispersal between adjacent patches $x$ and $y$ with rates $D_i$:

$$\frac{dN_i(x,t)}{dt} = r_i N_i(x,t)\left(1 - \sum_j A_{ij} N_j(x,t)\right) + \xi_i(x,t) + \sum_y D_i\left(N_i(y,t) - N_i(x,t)\right) \qquad \text{(1)}$$

The main ecological features that this model strives to capture are, in decreasing order of generality:

- Species can be extinct ($N_i = 0$) or alive, and the composition of surviving species, i.e. the set

  $$\Omega = \left\{ i \mid N_i > 0 \right\} \qquad \text{(2)}$$

  is a major characteristic of an ecosystem state. In fact, the most basic version of the model ($\xi_i = D_i = 0$) allows only one ecosystem state per composition: for a given set $\Omega$ of species, assuming they can coexist at equilibrium, their abundances must satisfy

  $$0 = 1 - \sum_{j \in \Omega} A_{ij} N_j^{*} \qquad \text{for all } i \in \Omega. \qquad \text{(3)}$$

  Such a linear system can only have one solution; however, that equilibrium need not be stable (and reachable), as we later discuss.

- Populations tend to grow or decay exponentially when rare, but this exponential growth is impacted by various factors.
- We can separate instantaneous growth into:
  - $r_i$ and $\xi_i$: fixed and fluctuating impacts of an environment (both abiotic and biotic) that is not itself impacted by the modelled variables,
  - interactions $A_{ij}$ between the modelled variables; the interactions themselves being constant across space and time by assumption (but we can relax that assumption, see below).

  We note that these contributions of environment and interactions, while they are separated and simply additive here in short-term growth, become unavoidably entangled in their long-term effects on abundances [12].

- Spatial fluxes can be approximated by simple “diffusion-like” dispersal between adjacent patches.
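As a minimal sketch of the equilibrium condition (3) in the first item above, the candidate abundances for a chosen composition can be obtained by solving the linear system restricted to the survivors. The function name `equilibrium_for_composition`, the example matrix `A` and the chosen survivor set are illustrative assumptions, not part of the model description; the sketch only computes the candidate equilibrium and checks that it is positive, and says nothing about its stability.

```python
import numpy as np

def equilibrium_for_composition(A, survivors):
    """Solve sum_{j in survivors} A_ij N_j* = 1 for every surviving species i."""
    idx = np.asarray(survivors)
    A_sub = A[np.ix_(idx, idx)]          # interactions among survivors only
    N_star = np.zeros(A.shape[0])        # extinct species stay at zero
    N_star[idx] = np.linalg.solve(A_sub, np.ones(len(idx)))
    return N_star

# Hypothetical example: 5 species, unit self-regulation, weak random competition.
rng = np.random.default_rng(0)
S = 5
A = np.eye(S) + 0.1 * rng.random((S, S)) * (1.0 - np.eye(S))

survivors = [0, 2, 3]
N_star = equilibrium_for_composition(A, survivors)

# The composition is only meaningful if all survivors have positive abundance.
feasible = bool(np.all(N_star[survivors] > 0))
print(N_star, feasible)
```

Solving the restricted system once per composition is what makes "one ecosystem state per composition" concrete: different survivor sets give different linear systems, each with its own candidate solution.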

##### 1.4.2 Broader landscape of disordered models

Here we briefly discuss ingredients that are not explored in this guide, but would certainly admit a disordered treatment, and have some potential to impact the aggregate many-species patterns of interest to us:

- Linear fluxes rather than bilinear interactions: Many fundamental dynamical properties of LV are due to the exponential tendency of population growth, i.e. $dN_i/dt$ is (mainly) proportional to $N_i$ in (1). This tendency may be lost if we have direct fluxes

  $$\frac{dN_i}{dt} = \sum_j M_{ij} N_j + \ldots \qquad \text{(4)}$$

  rather than bilinear interactions of the form $A_{ij} N_i N_j$. Such fluxes are found in chemostat models of abiotic nutrients, SIR models, management models like Ecosim, and they potentially capture the main qualitative features of “limited” interactions, e.g. saturated or ratio-dependent predation [13, 14]. Dispersal in (1) can play a similar role, but it is also special as it involves a conserved quantity (see below). The most obvious consequences are that these fluxes can rescue extinct species, and they induce much less extreme feedbacks and fluctuations^{6}. But it remains to be seen when the distinction really matters, theoretically and practically, in disordered or partially-structured communities.
- Internal population structure: a special case of the previous category, i.e. splitting a species into multiple subpopulations (e.g. ages, sizes, genetic types) with linear fluxes between them (e.g. aging, somatic growth, mutations or recombination). This is proposed to change the dynamics [15], notably by introducing “bottlenecks” – some outcomes of inter-species dynamics can be slowed down due to the intra-specific processes.
- Really non-additive (e.g. multiplicative) interactions: our model (1) assumes that the impacts of other species are somewhat substitutable, since their combined effect is simply the sum of their individual effects. One alternative is multiplicative interactions, where effects combine as a product: in the example of multiple essential resources, if a single resource is missing, consumer growth is zero. Long multiplicative chains of interactions would be a natural way of obtaining wide distributions (e.g. abundances). Many models of higher-order interactions combine additive and multiplicative aspects: they often use products like ${A}_{ijk...}{N}_{i}{N}_{j}{N}_{k}...$, but they usually sum over many such terms, and this additive aspect tends to outweigh the multiplicative aspect and lead to broadly similar phenomenology to that of our focal model, see e.g. [16].
- Other nonlinearities in interactions (functional responses), dispersal, etc.: in the absence of particular structure, they often fall within “irrelevant” details, in the sense that they can impact quantitative outcomes but they do not really alter the space of qualitative behaviors of the focal model, or only in a very straightforward way^{7}.
- Fluctuating interactions: another way to account for hidden complexity in interactions is to make them fluctuate over time. If these fluctuations are strong enough to dwarf the fixed differences in pairwise interactions, we go back to a kind of neutral model, since no species will maintain a particular role or advantage over time through its interactions. This typically amplifies inequalities compared to random LV, up to the point of allowing “condensation” where a single species holds a non-negligible fraction of the whole system’s biomass [19].
- Conservation laws: models in physics and chemistry often behave in very particular ways because of strong laws such as the conservation of energy and matter. Such laws are typically absent in ecological models, but having invariants (symmetries, constants of motion) could drastically change the space of possible dynamics. The most obvious example here is spatial fluxes, which conserve abundance (no individual is created or killed by dispersal itself).

### 2 Behaviors: New and old intuitions to explain what a disordered model will do

Here we attempt to give a bird’s eye view of the main behaviors of the Random Lotka-Volterra model, without going through precise mathematical results, but emphasizing the basic intuitions that explain what tends to happen in that model. This section summarizes the most important intuitions exposed across a wide range of papers.

#### 2.1 Parameterization before randomness

A surprisingly subtle question is: where to put randomness, or in other words, how to parameterize the model before drawing its parameters at random (see also Sec. 1.3 for relevant ideas and caveats).

Let us start from the non-spatial, unperturbed version of equation (1), i.e.

$$\frac{dN_i}{dt} = r_i N_i \left(1 - A_{ii} N_i - \sum_{j \ne i}^{S} A_{ij} N_j\right)$$

We see that $A_{ii} = 1/K_i$ defines the carrying capacity $K_i$, i.e. the equilibrium abundance in the absence of other species. Let us assume for now that we are looking at a competitive system where all species have a carrying capacity $K_i > 0$ (though everything that follows can be extended to e.g. trophic systems where predators do not survive alone).

We could also change variables to $\eta_i = N_i/K_i$, usually called the relative yield [5], and still have a Lotka-Volterra model:

$$\frac{d\eta_i}{dt} = r_i \eta_i \left(1 - \eta_i - \sum_{j \ne i}^{S} \alpha_{ij} \eta_j\right) \qquad \text{(5)}$$

with different interaction coefficients ($\alpha_{ij} = A_{ij} K_j$, as follows from substituting $N_j = K_j \eta_j$ and $A_{ii} = 1/K_i$ in the equation for $N_i$). If the $A_{ij}$ are independent random parameters, then the $\alpha_{ij}$ are not independent: all coefficients in column $j$ share the factor $K_j$ and are thus correlated. Conversely, if the $\alpha_{ij}$ are uncorrelated, the $A_{ij}$ are correlated with each other and with the $K_i$.
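The change of variables can be checked numerically. The sketch below assumes the equation for $N_i$ exactly as written above (with $A_{ii} = 1/K_i$), for which substituting $N_j = K_j \eta_j$ gives off-diagonal coefficients $A_{ij} K_j$; the distributions and sizes are arbitrary illustrative choices. It verifies that both parameterizations give the same per capita growth rates, and that independent $A_{ij}$ combined with widely distributed $K_i$ yield correlated $\alpha_{ij}$.

```python
import numpy as np

rng = np.random.default_rng(1)
S = 200

# Illustrative draws: wide carrying capacities, narrow independent per capita interactions.
K = rng.lognormal(mean=0.0, sigma=1.0, size=S)
A = rng.uniform(0.005, 0.015, size=(S, S))
np.fill_diagonal(A, 1.0 / K)                  # A_ii = 1/K_i

# Relative-yield coefficients obtained by substituting N_j = K_j * eta_j.
alpha = A * K[np.newaxis, :]                  # multiplies column j by K_j
np.fill_diagonal(alpha, 0.0)                  # self-term is the explicit -eta_i

# Compare the expressions in parentheses at an arbitrary positive state.
N = rng.uniform(0.1, 1.0, size=S) * K
eta = N / K
bracket_N = 1.0 - A @ N
bracket_eta = 1.0 - eta - alpha @ eta
same_dynamics = bool(np.allclose(bracket_N, bracket_eta))

# Column means of alpha track K_j, so the alpha_ij are not independent.
col_mean = alpha.sum(axis=0) / (S - 1)
corr = np.corrcoef(col_mean, K)[0, 1]
print(same_dynamics, corr)
```

The same check run in the other direction (independent `alpha`, then `A = alpha / K`) would show the converse: the per capita interactions inherit a correlation with the carrying capacities.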

This is ecologically meaningful. Assume for instance, as is often the case, that the carrying capacities ${K}_{i}$ are widely distributed (e.g. as a lognormal). Then,

- if the ${A}_{ij}$ are independent and narrowly distributed, it means that individuals from all species have comparable per capita competitive impacts on others, and thus, species with large $K$ will likely win the competition and kill others.
- by contrast, if the $\alpha_{ij}$ are independent and narrowly distributed, each species impacts others through $N_i/K_i$, that is, how much of its carrying capacity it is occupying, no matter how small that capacity is (this seems to be the case in some experiments discussed in [5]). This is sensible if, for instance, some species consume much more resources per capita: then their $K$ is smaller, but a few individuals from that species can still impact other species a lot. Then, there is no telling whether species with large $K$ will win or lose the competition.

This goes back to our point in Sec. 1.3: before drawing parameters at random, we should try to choose variables and write a model such that species are as equivalent as possible – which translates here to the fact that the parameters ($\alpha$, or $K$ and $A$, or others) should be as narrowly distributed and uncorrelated as possible. This could perhaps be automated as an optimization problem.

#### 2.2 Interactions and aggregate parameters

Our first intuitions on how interactions will matter can be obtained simply from rough orders of magnitude of parameter values.

Take for example the most basic Lotka-Volterra equation with relative yields (5) for many species, $S \gg 1$:

$$\frac{d\eta_i}{dt} = r_i \eta_i \left(1 - \eta_i - \sum_{j \ne i}^{S} \alpha_{ij} \eta_j\right)$$

For surviving species, the dynamics will tend to bring the expression in parentheses close to zero, even if it does not cancel out exactly (it will if there is a stable equilibrium, but that is not always the case).

The mean and variance of the sum $\sum_j \alpha_{ij} \eta_j$ give a simple estimate of how much the species’ abundances will differ from their carrying capacities (we recall that $\eta_i = N_i/K_i$) and between species. Interactions do not matter if $\sum_j \alpha_{ij} \eta_j \ll 1$, and so we can call this the “weak interaction” limit, where every species mainly follows its own intra-specific dynamics.

Otherwise, we see that the sum contains $S-1$ terms of the form $\alpha_{ij} \eta_j$. When $S$ is large, the expression in parentheses is thus unlikely to sum to zero if all the $\alpha_{ij}$ and $\eta_j$ are numbers $O(1)$, meaning “of order 1” – practically speaking, often some value between 0.1 and 10. That is unlikely but not impossible, if interactions have a particular structure that allows this perfect cancellation. For random interactions, there are other possibilities: either interactions are small, $\alpha_{ij} \ll 1$, or very few of them are nonzero, $P(\alpha_{ij} \ne 0) \ll 1$, or abundances are small on average (compared to carrying capacity), $\eta_j \ll 1$.
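A rough way to make this counting explicit, assuming for illustration that the $\alpha_{ij}$ are independent draws that are uncorrelated with the abundances $\eta_j$ (an approximation, since abundances result from the interactions), is:

$$\sum_{j \ne i}^{S} \alpha_{ij} \eta_j \;\approx\; S \, \langle \alpha_{ij} \rangle \, \langle \eta \rangle \;\pm\; \sqrt{S \, \text{var}(\alpha_{ij}) \, \langle \eta^2 \rangle}$$

where the first term is the mean of the sum and the second is its typical spread across species. With $\alpha_{ij}$ and $\eta_j$ both $O(1)$, the mean grows like $S$ and the spread like $\sqrt{S}$, so neither can be balanced by the $O(1)$ terms $1 - \eta_i$ in the parentheses. This is why, for random interactions, one of the three possibilities listed above is needed.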

Thus we can already foresee that the model has at least three main parameter regimes where interactions matter:

- Strong interactions, where a single species can play a significant role in the survival of another:

  $$\langle \alpha_{ij} \rangle, \; \text{var}(\alpha_{ij}) = O(1) \qquad \text{(6)}$$

  Most likely, either few species can coexist at any given time (a fixed number or one that increases slower than $S$, e.g. $\log S$), or many species coexist with very small abundances (e.g. a neutral model with $\alpha_{ij} = 1$ for every interaction and $\eta_i = 1/S$ for every species).

- Moderate/diffuse interactions, where interactions are individually small but their sum matters (as a collective impact of all interaction partners together), i.e. if we define aggregate parameters $\mu$ and $\sigma$ as

  $$\mu = S \, \langle \alpha_{ij} \rangle, \qquad \sigma^2 = S \, \text{var}(\alpha_{ij}) \qquad \text{(7)}$$

  then this regime corresponds to

  $$\mu, \sigma^2 = O(1) \qquad \text{(8)}$$

  In that case, many species (a macroscopic fraction, i.e. a number proportional to $S$) can coexist with $\eta = O(1)$ at any given time.

- Very sparse interactions: most interactions are exactly 0, and only $O(1)$ interactions per species are nonzero (or not very small), creating a system where many species can coexist and are affected strongly by a few other species. This case may have a tendency to behave more like few-species models, including more of a menagerie of different dynamics and behaviors than the other two cases, see e.g. [20].
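For the diffuse regime, the aggregate parameters can be estimated directly from an interaction matrix. The sketch below assumes the definitions $\mu = S \langle \alpha_{ij} \rangle$ and $\sigma^2 = S \, \text{var}(\alpha_{ij})$ over off-diagonal entries, and a Gaussian distribution of interactions; both function names and all numerical values are hypothetical choices made for illustration.

```python
import numpy as np

def draw_diffuse_interactions(S, mu, sigma, rng):
    """Draw alpha_ij with mean mu/S and standard deviation sigma/sqrt(S) (Gaussian assumed)."""
    alpha = rng.normal(mu / S, sigma / np.sqrt(S), size=(S, S))
    np.fill_diagonal(alpha, 0.0)   # the self-term is handled separately as -eta_i
    return alpha

def aggregate_parameters(alpha):
    """Estimate mu = S * mean and sigma = sqrt(S * var) from off-diagonal entries."""
    S = alpha.shape[0]
    off_diagonal = alpha[~np.eye(S, dtype=bool)]
    return S * off_diagonal.mean(), np.sqrt(S * off_diagonal.var())

rng = np.random.default_rng(2)
alpha = draw_diffuse_interactions(S=400, mu=1.0, sigma=0.5, rng=rng)
mu_hat, sigma_hat = aggregate_parameters(alpha)

# Individual interactions are small even though mu and sigma are of order 1.
largest_single_interaction = np.abs(alpha).max()
print(mu_hat, sigma_hat, largest_single_interaction)
```

The point of the scaling is visible in the last quantity: each pairwise coefficient is small, while the summed effect of all partners, summarized by `mu_hat` and `sigma_hat`, is of order 1.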

#### 2.3 Non-spatial behaviors: qualitative dynamics

Here we detail model behaviors in the absence of space and dispersal. The Random Lotka-Volterra model can exhibit a whole variety of qualitatively different dynamical regimes, also called “phases”, see Fig. 1:

- Global equilibrium
- Chaotic
- Multistable
- Noise-dominated (neutral-like)

Importantly, the phase in which a system is found, as well as all its properties (e.g. abundance distribution), depend only on a few aggregate parameters, i.e. statistics of the individual species parameters: mainly the sum-of-interactions mean and variance $\mu$ and $\sigma^{2}$ defined above in (7), and a few others^{8}. This means that we can limit ourselves to these few parameters when trying to fit data: this is not just a matter of practicality, but is supported by theory.

##### 2.3.1 Diffuse (weak to moderate) interactions

Three dynamical phases are predicted to exist when interactions are weak to moderate (as defined above), with suggestive empirical evidence for all three phases in [21]:

- Feasible phase: If interactions are very weak, all species coexist and reach a stable equilibrium together (only one, see 1.4.1). Theoretical conditions are given in [22]; they basically require $\sigma \sim 1/\log S$, because the more species there are, the more likely it is that at least one of them will go extinct due to an unlucky draw of its interactions.
- Unique attractor phase: If interactions are moderate but their variance between species is not too large ($\sigma$, defined by $\sigma^2 = S\,\text{var}\left(\alpha\right)$, is below a threshold value $\sigma_c$ of $O(1)$ given in [11]), a fraction of species go extinct, but the remainder coexist stably in an equilibrium that cannot be invaded by any of the extinct species (if we reintroduce them, they go extinct again). The abundance distribution is typically Gaussian.
- Chaotic phase: for moderate interactions above a certain threshold of heterogeneity between species, $\sigma >{\sigma}_{c}$, equilibria lose their stability due to the complexity of interactions, for a reason that was first intuited by May [23] for complex models in general, and demonstrated for Lotka-Volterra and similar models in [24, 11, 25]. The basic intuition for this chaotic phase is the following: at any given time, the dynamics tend to approach an equilibrium where only a fraction of species survive, but all such equilibria are unstable and can be invaded by other species that were previously extinct. Thus, a constant turnover occurs through a kind of “pinball” dynamics, the system bouncing between unstable equilibria with different compositions.

When $S$ is not very large (rule of thumb: less than 20), the boundaries between these phases are not so sharp, and systems where parameters are drawn with the same statistics may fall in one regime or another depending on luck of the draw. In addition, we can get limit cycles (periodic behavior) instead of chaos. But as we increase $S$, boundaries become sharp, and cycles in the “chaotic” phase are more and more rare and complex until only chaos occurs in the limit $S\to \infty $.
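To get a feel for the weak-interaction end of this picture, one can integrate the dynamics directly. The sketch below assumes rescaled variables with $r_i = 1$ and $K_i = 1$ for all species, Gaussian interactions with mean $\mu/S$ and standard deviation $\sigma/\sqrt{S}$, and a plain forward-Euler scheme; the function names and parameter values are illustrative, not taken from the references.

```python
import numpy as np

def simulate_rlv(alpha, eta0, t_max=200.0, dt=0.05):
    """Forward-Euler integration of d(eta_i)/dt = eta_i (1 - eta_i - sum_j alpha_ij eta_j)."""
    eta = eta0.copy()
    for _ in range(int(t_max / dt)):
        growth = 1.0 - eta - alpha @ eta
        eta = np.maximum(eta + dt * eta * growth, 0.0)   # abundances cannot go negative
    return eta

def summarize(alpha, eta, extinction_threshold=1e-6):
    """Fraction of surviving species and worst violation of the equilibrium condition."""
    survivors = eta > extinction_threshold
    residual = np.abs(1.0 - eta - alpha @ eta)[survivors].max()
    return survivors.mean(), residual

rng = np.random.default_rng(0)
S, mu, sigma = 50, 0.5, 0.1          # weak, diffuse interactions (illustrative values)
alpha = rng.normal(mu / S, sigma / np.sqrt(S), size=(S, S))
np.fill_diagonal(alpha, 0.0)

eta_final = simulate_rlv(alpha, rng.uniform(0.1, 1.0, S))
surviving_fraction, residual = summarize(alpha, eta_final)
print(surviving_fraction, residual)
```

Rerunning with larger `sigma` (and tracking `eta` over time rather than only its final value) is one way to explore the other phases, keeping in mind that at small `S` the outcome can vary from one draw to the next, as noted above.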

The type of chaos we see here is very different from the better-known types of chaos in ecology (low-dimensional chaos due to discrete or delayed single-species or few-species dynamics, e.g. the logistic map [26]).

Finally, we note that this many-species chaos can extinguish itself: if species go extinct for good and cannot reinvade, then fluctuations in abundances tend to get species killed until the remaining $S^*$ species are such that $\sigma^*$, defined by ${\sigma^*}^2 = S^*\,\text{var}\left(\alpha\right)$, is below the critical value $\sigma_c$. Chaos can be maintained by having immigration (or mutation) that allows species to reinvade, and we explain below in Sec. 2.5 that it can also persist in a metacommunity [27].

##### 2.3.2 Strong (and semi-sparse?) interactions

When interactions are strong, i.e. individually $O\left(1\right)$ (e.g. ${\alpha}_{ij}=0.5$), much fewer species can coexist at any given time. We typically see two important phases:

- a “strong interaction chaos” phase, that resembles the chaotic phase described above, but with much fewer species surviving at a given time. The abundance distribution typically looks
- a multistable phase, where there isn’t a single globally stable equilibrium, but many stable equilibria that each contain a relatively small subset of coexisting species.

The latter picture contrasts with traditional ecological intuitions on multistability in multiple ways. It differs from widely discussed examples of alternative stable states: this model will not exhibit regime shifts between just two or a few very distinct states, like savannas and forests or lake eutrophication [28]; a different picture of the world is needed for those. Furthermore, the type of succession it displays is not a deterministic (Clementsian) succession of a few stereotypical plant associations, but a much more stochastic trajectory through many possible combinations.

This phase might describe e.g. grasslands with many species forming many different associations, with adjacent “spots” retaining distinct compositions despite spatial fluxes. Coexisting subsets are “cliques” of species that do not compete too strongly, and are stable against all other species invading in small numbers. But a large enough invasion (or an invasion coupled with an environmental perturbation) can cause species in the clique to go extinct and the system to jump to a different stable equilibrium corresponding to another clique.

The transition between the two phases roughly occurs when $\langle \alpha \rangle =1$, though the precise boundary is theoretically unknown. It might be possible for systems to be in a state that mixes chaos and multistability, i.e. a dynamical landscape with multiple basins of attraction where at least some attractors are chaotic rather than equilibria. Current questions include the role of sparsity in allowing these various behaviors, see e.g. [29, 20] and for an extreme case [30].

Where diffuse interactions only allow a single equilibrium or chaos, the fact that strong interactions permit multiple stable equilibria opens up many new questions and observables (e.g. how deep are the basins of attraction of the various states). The equilibria observed in the multistable phase tend to be very numerous and to “look relatively alike”, i.e. many will tend to have comparable total biomass, diversity, etc. Nevertheless, we can see a kind of stochastic succession dynamics: Bunin has shown that equilibria can be assigned a metric of “maturity” that correlates roughly with species diversity, such that perturbations will on average cause more jumps toward more mature states [29] (see also [31] in a related model).

##### 2.3.3 Noise-dominated regime

Let us come back to the equations (1) with perturbations ${\xi}_{i}$ but without space, i.e.

$$\frac{dN_i\left(t\right)}{dt}=r_i N_i\left(t\right)\left(1-\sum_{j}A_{ij}N_j\left(t\right)\right)+\xi_i\left(t\right)$$

and consider the case where ${\xi}_{i}\left(t\right)$ is a stochastic noise, i.e. a number that varies over time like a random walk (for now let us assume it is white noise, i.e. uncorrelated across time), that can represent within-species processes like individual birth and death, or external factors like weather.

If interactions create less variation over time than the perturbations, i.e.

$$\text{var}\left(\sum_{j}A_{ij}N_i\left(t\right)N_j\left(t\right)\right)\ll \text{var}\left(\xi_i\left(t\right)\right)$$

the dynamics of species $i$ will be driven by this noise ${\xi}_{i}\left(t\right)$. The behavior depends importantly on whether more abundant species are more or less perturbed, i.e. whether ${\xi}_{i}$ depends on ${N}_{i}$:

- Demographic noise (from birth-death processes) can be shown to give noise that increases like the square root of total population abundance, i.e.
$$\xi_i\left(t\right)=\sqrt{N_i}\,\widehat{\xi}_i\left(t\right)$$ (9) The phenomenology is then very close to the original neutral theory of ecology [32], or models of fixation of neutral mutations.

- Environmental noise (i.e. changes in conditions that randomly impact growth rates $r_i$) gives
$$\xi_i\left(t\right)=N_i\,\widehat{\xi}_i\left(t\right)$$ (10) The model then behaves like the Stochastic Logistic Equation studied by [3] or the environmentally-perturbed neutral model studied by [33]. The main difference with the original neutral theory is that fluctuations are much larger and more rapid, so species can go extinct on much shorter time scales.

Three notes:

- neutral models originally assume a zero-sum game (the total number of individuals is fixed and divided among species), which can be recovered here (almost exactly) by having identical ${A}_{ij}$ for all $i,j$ (including ${A}_{ii}$). We could also assume non-interacting species ${A}_{ij}=0$ for $i\ne j$, in which case the total number of individuals obviously increases with species number, and is much more free to fluctuate over time. Such models, and everything between the two limits, have also been studied under the name of neutral models, see e.g. [34]. Many predictions will be similar across this whole range of models, e.g. abundance distributions [35, 34], while others are obviously different e.g. change of total biomass with diversity.
- some small immigration from the species pool (or mutations that replace extinct species) is necessary to prevent extinction of all species in the long term; otherwise this regime only makes sense on time scales that are not too long.
- the noise-free ($\xi =0$) chaotic phase discussed above is expected to look a lot like dynamics dominated by environmental noise: we can rewrite the latter as
$$\frac{dN_i}{dt}=r_i N_i\left(t\right)\left(1-\sum_{j}A_{ij}N_j\left(t\right)+\frac{\widehat{\xi}_i\left(t\right)}{r_i}\right)$$ Chaos occurs when the term $\sum_{j}A_{ij}N_j\left(t\right)$ acts like a noise perturbing the growth rate (as would environmental stochasticity), thus the consequences are similar to having nonzero $\widehat{\xi}$. The main difference is that fluctuations in the chaotic regime are created by interactions with other species, and will increase with species diversity (and go to zero when diversity falls below a threshold, see Sec. 2.3.1), whereas $\widehat{\xi}_i$ is in principle completely independent of other species. An important consequence of this is that external noise may have to be tuned to some level to preserve species diversity, neither too weak nor too strong depending on the system, whereas endogenous fluctuations (chaos) are “self-tuning”.

To conclude, our focal model somewhat subsumes neutral theory and Stochastic Logistic Equation (SLE) as particular noise-dominated limiting cases; in addition, most evidence for the SLE is also evidence for the chaotic regime of noiseless Random Lotka-Volterra.

#### 2.4 Non-spatial behaviors: quantitative properties

##### 2.4.1 Stability

Our two main questions pertaining to stability are:

- is the system without perturbations ($\xi =0$) in a globally stable equilibrium, chaotic phase, or multistable phase with multiple basins of attraction? (discussed above)
- then, how does the system respond to perturbations $\xi_i\left(t\right)$, including
  - Press perturbations, i.e. $\xi_i$ constant over time (e.g. a new permanent source of growth or mortality is added, like humans harvesting some species at a fixed rate)
  - Pulse perturbations, i.e. instantaneous impacts $\xi_i\left(t\right)=\delta \left(t-t^{\prime}\right)\xi_i$ (e.g. a sudden fire or resource bloom)
  - Noise $\xi_i\left(t\right)=W\left(t\right)\xi_i$ with $W$ a random walk (e.g. weather fluctuations)
  - Periodic perturbations $\xi_i\left(t\right)=\sin\left(\omega t+\varphi \right)\xi_i$ (e.g. night-day or seasonal cycles)

In systems with random interactions, we can show that stability properties are to a large extent ordered by species’ abundances, or rather by relative yields $\eta =N/K$ [36]. Indeed, stability properties around an equilibrium can be studied by looking at the Jacobian matrix at that equilibrium, defined as

$$J_{ij}=\frac{d\dot{N}_i}{dN_j},\qquad \dot{N}_i=\frac{dN_i}{dt}$$ (11)

In the basic Lotka-Volterra equation (without noise or space), we can see that around an equilibrium with abundances $N^{*}$, the Jacobian elements (for species with $N_i^{*}>0$) are simply given by

$$J_{ij}=-r_i N_i^{*}A_{ij}$$ (12)

i.e. they are interactions multiplied by equilibrium abundances (and growth rates), with a minus sign because $A_{ij}$ enters the growth rate negatively. When interactions and growth rates are random, abundance becomes the main factor distinguishing species in their dynamics.

In particular, species that are rare (compared to their carrying capacity) have very slow dynamics compared to their growth rate, as can be seen by looking at the diagonal term of the Jacobian:

$$\left|J_{ii}\right|=r_i N_i^{*}A_{ii}=r_i\frac{N_i^{*}}{K_i}\ll r_i$$

Thus, stability indicators such as asymptotic resilience (long-term return rate to equilibrium) mainly describe the return rate of the rarest species. Likewise, coefficient of variation (CV=std/mean) is also likely to be biased, giving a large role to rare species.
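The relation between the Jacobian, relative yields and asymptotic resilience can be checked numerically. This is a minimal sketch under stated assumptions: weak Gaussian off-diagonal interactions, $A_{ii} = 1/K_i$, and parameter ranges chosen only so that all species coexist. Differentiating $r_i N_i (1 - \sum_j A_{ij} N_j)$ at a coexistence equilibrium gives $J_{ij} = -r_i N_i^* A_{ij}$, which the code compares against finite differences.

```python
import numpy as np

def lv_rate(N, r, A):
    """Right-hand side dN_i/dt = r_i N_i (1 - sum_j A_ij N_j), without noise or space."""
    return r * N * (1.0 - A @ N)

def jacobian(N_star, r, A):
    """Analytic Jacobian at a coexistence equilibrium: J_ij = -r_i N*_i A_ij."""
    return -(r * N_star)[:, None] * A

def numerical_jacobian(N_star, r, A, h=1e-6):
    """Central finite differences, used only as an independent check."""
    S = len(N_star)
    J = np.empty((S, S))
    for j in range(S):
        step = np.zeros(S)
        step[j] = h
        J[:, j] = (lv_rate(N_star + step, r, A) - lv_rate(N_star - step, r, A)) / (2 * h)
    return J

rng = np.random.default_rng(1)
S = 20
K = rng.uniform(0.5, 2.0, S)                     # carrying capacities (illustrative)
r = rng.uniform(0.5, 1.5, S)                     # growth rates (illustrative)
A = rng.normal(0.3 / S, 0.1 / np.sqrt(S), size=(S, S))
A[np.diag_indices(S)] = 1.0 / K                  # self-interaction A_ii = 1/K_i

N_star = np.linalg.solve(A, np.ones(S))          # coexistence equilibrium: A N* = 1
J = jacobian(N_star, r, A)

relative_yield = N_star / K
self_regulation = -np.diag(J) / r                # |J_ii| / r_i, expected to equal N*_i / K_i
resilience = -np.linalg.eigvals(J).real.max()    # asymptotic return rate to equilibrium
print(relative_yield.min(), resilience)
```

Comparing `resilience` with `(r * relative_yield).min()` across draws is one way to see how closely the long-term return rate tracks the rarest species.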

More generally, it can be shown that various metrics of stability are mainly influenced by rare species (like asymptotic resilience) or by abundant species (e.g. variance in response to environmental noise). As such, these metrics do not clearly represent “community properties”: the rest of the community could be much more or much less stable than the species that dominate a given indicator.

The fact that rarity or commonness are such strong predictors of a species’ role in community stability is because a random network does not assign special positions to species – in a structured network, contributions to stability may have more to do with the place of a species, e.g. a top predator species may have disproportionate importance despite being rare, because of where it sits in the structure. A random network offers no notion of “where”, except a vague one that can be summarized as: is the species overall favored or disfavored by all the other interactions in the network.

A systematic way of thinking about stability and comparing it across ecosystems is still lacking, but see [37, 36] for intuitions, a few of which are summarized in Table 1.

| | Return rate after pulse | Change after press | Variance from noise |
| --- | --- | --- | --- |
| | $\widehat{\xi}\left(t\right)=\delta \left(t\right)$ | $\widehat{\xi}\left(t\right)=1$ | $\widehat{\xi}\left(t\right)=W\left(t\right)$ |
| Biased toward rare | Asymptotic | | Immigration $\xi =\widehat{\xi}$ |
| Unbiased | Median | Environmental $\xi =N_i\widehat{\xi}$ | Demographic $\xi =\sqrt{N_i}\,\widehat{\xi}$ |
| Biased toward common | Instantaneous | | Environmental $\xi =N_i\widehat{\xi}$ |

##### 2.4.2 Invasions and extinctions

While stability properties above mainly concern the response of a system to quantitative changes in variables or parameters, we can also study the system’s stability to adding or losing a species.

For diffuse interactions (Sec 2.3.1), individual interactions are rather weak, of order $\mu /S$ or $\sigma /\sqrt{S}$. Thus, if an invader causes extinctions in the community, those will typically be extinctions of rare species, due to the latter having abundances of order $1/\sqrt{S}$, not due to particular interactions between these species.
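The scaling argument above can be illustrated with two small quantities: the per-capita growth rate of a rare invader facing a resident community, and the first-order effect of an established invader on each resident. The sketch assumes rescaled variables ($r = K = 1$), Gaussian interactions with mean $\mu/S$ and standard deviation $\sigma/\sqrt{S}$, and a crude uniform resident state; all names and values are hypothetical.

```python
import numpy as np

def invasion_growth_rate(alpha_in, eta_resident):
    """Per-capita growth rate of a rare invader: 1 - sum_j alpha_0j eta_j."""
    return 1.0 - alpha_in @ eta_resident

def direct_effect_on_residents(alpha_out, eta_invader):
    """First-order change in each resident's per-capita growth rate: -alpha_i0 * eta_0."""
    return -alpha_out * eta_invader

rng = np.random.default_rng(3)
mu, sigma = 1.0, 0.5
typical_effect = {}
for S in (100, 10000):
    eta_resident = np.full(S, 1.0 / (1.0 + mu))           # uniform resident state (assumption)
    alpha_in = rng.normal(mu / S, sigma / np.sqrt(S), S)   # effect of residents on the invader
    alpha_out = rng.normal(mu / S, sigma / np.sqrt(S), S)  # effect of the invader on residents
    effects = direct_effect_on_residents(alpha_out, eta_invader=0.5)
    typical_effect[S] = np.abs(effects).mean()
    print(S, invasion_growth_rate(alpha_in, eta_resident), typical_effect[S])
```

The typical direct effect shrinks roughly like $1/\sqrt{S}$, so in this sketch only residents whose own abundance is already of that order are at risk from a single invader.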

For strong interactions, we can understand under which conditions invaders are likely to cause more dramatic transformations to the community, see [38]. This happens when the invader experiences a strong positive feedback, e.g. when it can grow in abundance, kill some competitors which leads to a more favorable environment with even faster growth, and so on. This, and the emergence of “keystone” species in a random network, can be understood by looking at the distribution of susceptibility $\chi_{ij}=dN_i/dK_j$ defined in Table 2 and checking when they diverge or even become negative, as discussed in [38] and follow-ups (in prep).

#### 2.5 Spatial behaviors

The introduction of space, bringing us to the full equations (1), poses two additional questions:

- How species dispersal between patches can change the dynamics, for instance by rescuing extinct species, even in a homogeneous environment where all parameters (carrying capacities, interactions...) are constant across space
- How spatial heterogeneity in parameters impacts dynamics and shapes measurable patterns

It also increases the number of important aggregate parameters quite significantly since we now need at the very least: the variance of dispersal rates ${D}_{i}$, whether they correlate with other species properties, and the spatial variance and autocorrelation of growth rates or carrying capacities.

##### 2.5.1 Dispersal in a homogeneous environment

We note that an environment can be considered homogeneous even if it has variations, provided they occur on small scales (as measured by the correlation length of heterogeneity). By small, we mean that they are below the typical spatial scale of dispersal, which increases with dispersal rate and can extend over many patches even if dispersal occurs only between neighboring patches, see [39].

As for simple metapopulations, we can roughly separate three regimes of dispersal [40], starting with two extremes:

- isolated communities: dispersal is weak enough that patches are effectively isolated from each other
- well-mixed system: dispersal is strong enough that all patches are equalized and synchronized in dynamics, effectively going to a single large well-mixed patch
- in-between, we have a regime where nontrivial spatial dynamics can occur, such as asynchronous or antisynchronized fluctuations, or domains expanding and shrinking (e.g. travelling waves of species growing or going extinct, invasion fronts)

The observed regime depends on both dispersal rate and system size (number of patches or spatial extent $x\in \left[0,L\right]$). As discussed in [40], for many models and questions, interesting behaviors happen at the boundaries between these three regions.

For Random Lotka-Volterra, we can make a few more predictions:

- Under diffuse interactions, we do not see “interesting” spatio-temporal dynamics, but either equilibrium or chaotic fluctuations. With strong and somewhat sparse interactions however, we see clear domains [41] and directional dynamics with succession, community range expansion etc [11]
- As discussed in Sec. 2.3.1, chaotic dynamics tend to extinguish themselves in a well-mixed system^{9}. However they can be rescued by having multiple patches, with dispersal strong enough to rescue extinct species but too low to synchronize dynamics; asynchronous dynamics emerge easily, requiring infinitesimally small heterogeneities in the landscape if there are many patches [27]. There is a striking effect at play here that can already be understood from a single metapopulation: a population that has on average negative growth rates in every patch can keep rescuing itself forever when patches are coupled.

##### 2.5.2 Heterogeneous environments

What happens if we let parameters such as growth rates and interactions (or just carrying capacities) vary across the landscape? All in all, these questions have barely been explored in our focal model, although ongoing work is homing in on this problem.

McGill [1] has proposed that different mechanisms will generate the same commonly-studied spatial patterns (e.g. the Species-Area Relationship, see Sec. 3) if they lead to the same “stochastic geometry”, i.e. an effectively random distribution of individuals in space, where only a small number of properties matter. A particularly simple type of stochastic geometry might be one that:

- specifies the degree of clustering of individuals of the same species
- specifies the (e.g. lognormal) distribution of species abundances
- otherwise assumes species to be independently distributed in space

This idea has been picked up and discussed and tested empirically in recent articles by various groups [42, 43].

We can discuss when a given type of stochastic geometry is expected to arise from a Random Lotka-Volterra metacommunity. For instance, the above three assumptions are actually quite likely to hold in the chaotic regime with diffuse interactions, since temporal fluctuations are highly uncorrelated between species – despite the fact that these fluctuations only exist due to the very interactions between all these species. But the parameter regime that best maintains chaos is one where dispersal does not strongly couple local populations of the same species, so intraspecific spatial correlations should be rather limited. Perhaps empirical values of spatial regime will prove compatible with this chaotic regime, perhaps not.

### 3 Patterns: What can be predicted, inferred and tested empirically

| Involved variables | Example observables |
| --- | --- |
| Abundances $P\left(N\right)$, composition $\Omega$: snapshot / equilibrium; nonequilibrium stationary, with $C_{ij}=\text{cov}_{x,t}\left(N_i,N_j\right)$ and $R\left(t,t^{\prime}\right)=\text{cov}_{x,i}\left(N_i\left(t\right),N_i\left(t^{\prime}\right)\right)$ | $N_{tot}=\sum_{i}N_i$; $S^{*}=\sum_{i}\left(N_i>0\right)$; fluctuations $E_i\left(C_{ii}\right)$; autocorrelation time |
| + Parameters $P\left(N,\theta \right)$ | Relative yield $E\left(N_i/K_i\right)$; properties of survivors [46]: $E_i\left(K_i \mid N_i>0\right)-E_i\left(K_i\right)$; $E_{i,j}\left(A_{ij} \mid N_i,N_j\right)$ [5] |
| + Perturbations: press $\xi_i$ (susceptibility $\chi_{ij}=dN_i^{*}/d\xi_j$; extinction $N_i^{*}\to 0$, with $\xi_j=-A_{ji}N_i^{*}$; invasion $0\to \epsilon$); pulse $\xi_i\left(t\right)=\delta \left(t-t^{\prime}\right)\xi_i$; noise $\xi_i\left(t\right)=W\left(t\right)\xi_i$, with $\text{cov}\left(\xi_i\left(t\right),\xi_j\left(t\right)\right)=\Sigma_{ij}$; periodic $\xi_i\left(t\right)=\sin\left(\omega t+\varphi \right)\xi_i$ | $\lvert \Delta N\rvert =E_i\left(\left(N_i\left(\infty \right)-N_i\left(0\right)\right)^{2}\right)^{1/2}$; initial, median, final return rate to equilibrium; variability $E_i\left(\text{var}\left(N_i\left(t\right)\right)\right)$; Taylor’s law $\text{var}\left(N_i\right)$ vs $\langle N_i\rangle$ and other stability-abundance relations (maybe involving $N_i/K_i$ instead) [36] |

All the patterns that can be predicted using a LV-type model like (1) must be constructed entirely from

- abundances ${N}_{i}\left(x,t\right)$,
- model parameters $S$, ${r}_{i}$, ${A}_{ij}$, ${D}_{i}$
- perturbations ${\xi}_{i}\left(x,t\right)$
- observation parameters: number of samples, time range, rarefaction (data pipeline)

In a disordered model, any observable must be somehow aggregated over individual species $i$, sites/samples $x$ and times $t$.
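As an illustration of what such aggregation looks like in practice, the sketch below computes a few of the observables listed above from an abundance array. The array layout `(species, sites, times)`, the pooling of sites and times into one sample axis, and the function name `observables` are assumptions made for this example.

```python
import numpy as np

def observables(N, presence_threshold=0.0):
    """Aggregate observables from abundances N of shape (S, X, T): species, sites, times."""
    S = N.shape[0]
    total_abundance = N.sum(axis=0)                      # N_tot for each site and time
    richness = (N > presence_threshold).sum(axis=0)      # S* for each site and time
    pooled = N.reshape(S, -1)                            # pool sites and times as samples
    C = np.cov(pooled)                                   # C_ij = cov_{x,t}(N_i, N_j)
    return {
        "N_tot": total_abundance.mean(),                 # averaged over sites and times
        "S_star": richness.mean(),
        "C": C,
        "mean_variance": np.diag(C).mean(),              # E_i(C_ii)
    }

# Tiny hand-checkable example: 2 species, 1 site, 3 times
N_example = np.array([[[1.0, 2.0, 3.0]],
                      [[0.0, 0.0, 2.0]]])
obs = observables(N_example)
print(obs["N_tot"], obs["S_star"], obs["mean_variance"])
```

Observation parameters such as the number of samples or rarefaction would enter as further arguments that subsample `N` before aggregation; they are left out here to keep the sketch short.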

#### 3.1 Warnings and caveats

##### 3.1.1 Measuring stability

Many problems arise when trying to compare ecosystems in terms of stability. We often take time series $N\left(t\right)$ from various ecosystems, look at their temporal variability or total displacement or some such metric, and use this to conclude that an ecosystem is more or less stable than another.

But it almost never occurs that we really know how strong the perturbation $\xi $ was that induced the observed variation in abundances $N$. Meaningful metrics of stability, i.e. those we can use to estimate interaction strengths or other processes of interest, are usually not so much about “how much an ecosystem moves in total” and more about “how much an ecosystem moves relative to how much you perturb it”, hence expressions like $dN/d\xi $ in Table 2.

This problem of being unable to normalize by perturbation strength is only one of a series of normalization problems, discussed next.
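The difference between raw displacement and displacement per unit perturbation can be made explicit for a press perturbation. The sketch below assumes $K_i = r_i = 1$, weak Gaussian interactions, and a Newton solver written for this example; it compares the shift of the equilibrium under a small constant $\xi$ with the linear-response prediction $\delta N \approx -J^{-1}\xi$, where $J_{ij} = -r_i N_i^* A_{ij}$ is the Jacobian at the unperturbed equilibrium.

```python
import numpy as np

def equilibrium_under_press(N_start, r, A, xi, n_iter=20):
    """Newton iterations for r_i N_i (1 - sum_j A_ij N_j) + xi_i = 0, starting from N_start."""
    N = N_start.copy()
    for _ in range(n_iter):
        growth = 1.0 - A @ N
        F = r * N * growth + xi
        dF = np.diag(r * growth) - (r * N)[:, None] * A
        N = N - np.linalg.solve(dF, F)
    return N

rng = np.random.default_rng(2)
S = 20
r = np.ones(S)
A = rng.normal(0.3 / S, 0.1 / np.sqrt(S), size=(S, S))
np.fill_diagonal(A, 1.0)                       # K_i = 1 for all species (assumption)

N_star = np.linalg.solve(A, np.ones(S))        # unperturbed coexistence equilibrium
J = -(r * N_star)[:, None] * A
chi = -np.linalg.inv(J)                        # linear-response susceptibility dN*/dxi

direction = rng.normal(size=S)
for strength in (1e-4, 2e-4):
    xi = strength * direction
    dN = equilibrium_under_press(N_star, r, A, xi) - N_star
    print(strength, np.linalg.norm(dN), np.linalg.norm(dN) / strength)
```

The raw displacement doubles when the perturbation doubles, whereas the ratio in the last column stays essentially constant: only the latter says something about the system rather than about how hard it was pushed.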

##### 3.1.2 Normalizing observables

Many properties must be normalized to allow for “fair” comparison between ecosystems. For instance, if we compare biomass production across ecosystems, we may want to normalize by something like system size, to have a notion of how efficient each ecosystem is relative to its size. Such a normalized metric may then be “scale-invariant” in the sense that we can meaningfully compare it across measurements done at different scales.

The choice of how to normalize is self-evident only when considering linear/additive metrics, such as biomass (the biomass of a system is always the sum of biomasses of subsystems). In that case, normalizing is always equivalent to taking an average.

There is no single correct way to normalize a property like variability. Consider for instance: if variability is due to demographic noise, then we expect $\text{var}\left(N\left(t\right)\right)$ to be proportional to $N\left(t\right)$ (see 2.4.1), so the correct normalization is to divide the variance by the total biomass. If variability is due to environmental noise, then $\text{var}\left(N\left(t\right)\right)$ is proportional to $N\left(t\right)^{2}$, so we should normalize by the square of total biomass. Choosing the wrong normalization here will automatically mean that the normalized variance decreases or increases with total biomass (and thus with scale) regardless of anything else.

What this means is that “how to normalize” is not a property of the metric, but a property of the process: some processes will give scale-invariant metrics once normalized by $\sum N$, others by ${\left(\sum N\right)}^{2}$, others by something else or nothing at all. Only linear metrics escape this because linearity automatically entails that one should normalize by $\sum N$.
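A small simulation can show why the normalization depends on the process. The sketch below is a single-species caricature, not the full model: a logistic population with carrying capacity $K$, integrated with an Euler-Maruyama scheme, with noise amplitude proportional either to $\sqrt{N}$ (demographic) or to $N$ (environmental). Noise strengths, time step and the function name are illustrative assumptions.

```python
import numpy as np

def stationary_variance(K, kind, strength, r=1.0, dt=0.02, t_max=500.0, n_rep=40, seed=0):
    """Euler-Maruyama estimate of var(N(t)) for a noisy logistic population, one value per K."""
    rng = np.random.default_rng(seed)
    K = np.asarray(K, dtype=float)[:, None]           # shape (n_K, 1)
    N = np.repeat(K, n_rep, axis=1)                   # every replicate starts at N = K
    n_steps = int(t_max / dt)
    burn_in = n_steps // 10
    total, total_sq, count = 0.0, 0.0, 0
    for step in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt), size=N.shape)
        amplitude = np.sqrt(N) if kind == "demographic" else N
        N = N + r * N * (1.0 - N / K) * dt + strength * amplitude * dW
        N = np.maximum(N, 0.0)                        # abundances cannot go negative
        if step >= burn_in:
            total = total + N
            total_sq = total_sq + N**2
            count += 1
    mean = total / count
    return (total_sq / count - mean**2).mean(axis=1)  # temporal variance, averaged over replicates

K_values = np.array([50.0, 200.0])
var_demo = stationary_variance(K_values, "demographic", strength=1.0)
var_env = stationary_variance(K_values, "environmental", strength=0.1)

print("demographic   var/K   :", var_demo / K_values)
print("demographic   var/K^2 :", var_demo / K_values**2)
print("environmental var/K^2 :", var_env / K_values**2)
```

In this sketch, dividing by $K$ makes the demographic case roughly scale-invariant, while dividing the same data by $K^2$ produces an apparent decline of variability with abundance that is purely an artifact of the normalization.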

| Pattern | Definition | Example studies |
| --- | --- | --- |
| Species Area Relationship | $S^{*}$ vs area | [4] |
| Diversity-variability | $\text{var}\left(N\right)$ vs $S^{*}$ | [47] |
| Diversity-invadability | invasion success vs $S^{*}$ | [48] |
| Dissimilarity-Overlap | $\text{corr}\left(N\left(x\right),N\left(y\right)\right)$ vs $\text{corr}\left(\Omega \left(x\right),\Omega \left(y\right)\right)$ | [49] |

#### 3.2 Patterns and robustness

We can ask two questions:

- Why are some patterns (e.g. SAD, SAR) far more studied than others?
- Is it useful to test many patterns at once, can we improve our ability to infer model parameters?

An important direction to be explored is whether certain basic or composite patterns are more robust to various types of error, noise and uncertainty.

While we can always try to assume the error distribution and infer the real pattern as latent variables, it is probably much safer to focus on patterns that are intrinsically less affected by these errors.

As an example, relative properties, such as relative change of biomass in response to a perturbation, are potentially more robust to errors of units/variables, but more prone to errors due to taking a ratio (which is bad when the error bars of the denominator can touch zero, for instance). This also connects to the problems of normalization discussed above.

What should be done in future versions of this guide is compile insights from previous attempts at fitting LV and Random LV models to data, e.g. [5, 21]. As a special case, mean interaction strength seems to be a property that we can infer quite reliably in many settings [50] whereas variance is more of a problem.

### 4 Extensions: Partially structured models

One of our main claims is that the right mix of disorder and structure could allow us to understand many-species ecosystems. This has notably been discussed theoretically in [6] (main text and examples in SI), but a general method for applying this idea to empirical communities and data analysis is still lacking.

Adding structure means adding parameters, typically ending up with as many parameters as classical few-species ecological models (e.g. tritrophic chains), or even more (since the presence of disorder means that we care not only about average interaction between e.g. trophic levels, but also about the variance of these interactions).

On the other hand, adding structure also multiplies the number of patterns that can be observed in data. This number probably grows faster than the number of parameters, but the patterns also become more and more fragile as we increase the level of detail of the structure.

#### 4.1 Types of structures

##### 4.1.1 Few important species or links, wide distributions and sparsity

This is not really structure but a very simple deviation from our basic model: having a few strong interactions and many weak ones (wide distributions of parameters), or just a few strong ones (sparse networks). We have already touched on sparsity above, but there are many dimensions to explore there. Something like a power-law distribution of interaction strength could be the focus of dedicated work, as there are known methods for dealing with this kind of distribution.

The basic intuition for the importance of this possibility is that ecology is often a combination of a small numbers game and a large numbers one: for instance, it will often occur that a single species (e.g. Daphnia in aquatic systems) has a totally disproportionate and idiosyncratic role in the dynamics, while many other species can be treated as a collective. This may also be true of interactions, with a few crucial ones for each species and many more diffuse ones. It remains to be seen whether the “exceptional” species are really qualitatively different, or merely the successful tail of a probability distribution describing all the species. All the caveats noted in Sec. 1.3 and 2.1 apply here.

##### 4.1.2 Discrete structures: Modularity, functional groups

One of the most common types of structure that can be added to a random network is blocks or modules, such as functional groups, trophic levels, core and peripheral species, etc.

Instead of having the mean and variance of all interactions as the main aggregate parameters shaping the dynamics, we now have potentially distinct means and variances for all pairs of groups, which increases the number of fitting parameters like the square of the number of groups. Stochastic Block Models [51] are a favored inference tool to estimate these matrices of parameters. It might also be possible to detect the existence of groups more easily than we can infer their precise parameter values.
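To make the parameter counting concrete, here is a minimal sketch of a block-structured random matrix, in which each pair of groups has its own mean and standard deviation. It assumes Gaussian entries and known group memberships, and simply recovers the block statistics by averaging; it is not a Stochastic Block Model inference, which would also have to infer the groups. Function names and numerical values are hypothetical.

```python
import numpy as np

def sample_block_interactions(groups, block_mean, block_std, rng):
    """Draw alpha_ij ~ Normal(block_mean[g_i, g_j], block_std[g_i, g_j]); diagonal set to 0."""
    groups = np.asarray(groups)
    mean = block_mean[np.ix_(groups, groups)]    # S x S matrix of entry-wise means
    std = block_std[np.ix_(groups, groups)]      # S x S matrix of entry-wise standard deviations
    alpha = rng.normal(mean, std)
    np.fill_diagonal(alpha, 0.0)
    return alpha

def estimate_block_statistics(alpha, groups, n_groups):
    """Mean and standard deviation of off-diagonal interactions for every pair of groups."""
    groups = np.asarray(groups)
    off_diagonal = ~np.eye(len(groups), dtype=bool)
    mean = np.empty((n_groups, n_groups))
    std = np.empty((n_groups, n_groups))
    for g in range(n_groups):
        for h in range(n_groups):
            mask = off_diagonal & (groups[:, None] == g) & (groups[None, :] == h)
            mean[g, h] = alpha[mask].mean()
            std[g, h] = alpha[mask].std()
    return mean, std

rng = np.random.default_rng(4)
groups = np.array([0] * 60 + [1] * 40)           # two groups of species (illustrative)
block_mean = np.array([[0.02, -0.05],
                       [0.05, 0.01]])
block_std = np.full((2, 2), 0.01)

alpha = sample_block_interactions(groups, block_mean, block_std, rng)
mean_hat, std_hat = estimate_block_statistics(alpha, groups, n_groups=2)
print(mean_hat)
print(std_hat)
```

With $G$ groups this description carries $2G^2$ interaction parameters (one mean and one standard deviation per ordered pair of groups), which is the quadratic growth mentioned above.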

##### 4.1.3 Continuous structures: Traits, trade-offs

Another classic type of structure is continuous properties: axes such as age or size or degree of generalism, trade-off surfaces such as the colonization-competition trade-off.

They are in some ways more complex than groups since we can imagine arbitrary continuous axes and functions, and in other ways simpler, since we expect continuity between adjacent values. Thus, rather than distinct parameters for each trait position, we can simply ask about e.g. the average slope of variation of parameters with species traits. We could imagine systematically comparing blocks and continuum descriptions: refining the block description by adding more blocks, and the continuous description by adding more derivatives, then doing model selection.

#### 4.2 New patterns

Introducing structure opens up new measurable properties, typically the same observables as before (Table 2) now resolved by their position in the structure. For instance, with a group structure:

- instead of total community biomass or abundance, we can have a more fine-grained distribution across groups, e.g. biomass pyramids
- instead of community stability, we can study the response of one group to a perturbation of another, e.g. trophic cascades
- instead of a community property (e.g. species diversity) versus some control factor like resources or area, we can look at the covariation between groups (e.g. predator diversity versus prey diversity) across the range of the control factor

When the structure is not made of groups but of e.g. continuous positions along a trait axis or trade-off surface, we can instead ask about the slope of change of these properties across the axis (e.g. a continuous size spectrum).

Thus, all the patterns of interest could in principle be explored systematically by using the single-community patterns and resolving them in all possible combinations. Of course, the combinatorics probably become quite unwieldy.
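To illustrate how a community-level observable can be resolved by group, here is a minimal sketch of a group-to-group press response. It assumes (our assumption, not stated in this section) Lotka-Volterra dynamics of the form dN_i/dt = N_i (K_i - N_i - Σ_j A_ij N_j) with all species coexisting at a fixed point, so that a small sustained change δK shifts abundances by δN = (I + A)^{-1} δK. The helper names are hypothetical.

```python
def solve_linear(M, b):
    """Solve M x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(M, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def group_biomass(N, groups, n_groups):
    """Total abundance per group, e.g. to draw a biomass pyramid."""
    totals = [0.0] * n_groups
    for n_i, g in zip(N, groups):
        totals[g] += n_i
    return totals


def group_press_response(A, groups, n_groups):
    """R[a][b]: change in total abundance of group a when the carrying
    capacity of every species in group b is raised by one unit.

    Valid only while all species keep coexisting at the fixed point.
    """
    n_species = len(groups)
    M = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n_species)]
         for i in range(n_species)]
    R = [[0.0] * n_groups for _ in range(n_groups)]
    for b in range(n_groups):
        dK = [1.0 if g == b else 0.0 for g in groups]
        dN = solve_linear(M, dK)
        for i, g in enumerate(groups):
            R[g][b] += dN[i]
    return R
```

A negative off-diagonal entry of `R` means that favoring one group depresses another, which is the kind of signal one would look for in a trophic cascade.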

#### 4.3 Gains compared to traditional few-variable models (e.g. motifs)

There will be use cases where there is practically no gain in modelling many species with partially-structured interactions rather than studying a simple few-species motif. For instance, in a bitrophic model, maybe the total biomass of consumers and resources can be predicted without including within-group disorder.

The potential value of considering many species with random parameters can come from the following sources:

- it allows capturing patterns that are meaningless without species heterogeneity, such as those listed in Sec. 4.2 above
- it gives theoretical tools to predict and test whether the traditional model indeed emerges as a robust simplification – for instance
- it allows us to deal with situations where group boundaries are not strict
- we can study chaos or complex dynamics within each group, and ask whether they prevent the group from behaving like a single variable in aggregate

### Box: Step-by-step guide to applying disordered systems theory

This box serves as a quick summary of the points made throughout the manuscript. A disordered model will focus on the following elements:

- stable differences between species in their environmental response (random carrying capacities) and biotic interactions (random interaction network)
- fluctuating differences between species in their environmental responses (external noise) and biotic interactions (stochastic exchanges)

A step-by-step description of the process could be:

- Write a model of the system that tries to make the species as equivalent as possible (Sec. 1.3), e.g. expressing them in a common currency like total biomass or total resource consumption, to check if residual heterogeneity can be treated as essentially disordered
- Introduce all elements in the model that are needed to resolve measurable behaviors and patterns (Sec. 3)
- Try to fit the disordered model to data using the most discriminating patterns, bearing in mind that some parameters are hard to resolve depending on which patterns are available to measure (e.g. it is hard to distinguish temporal fluctuations due to environmental perturbations versus species interactions)
- If disordered predictions are unsuccessful, introduce the minimal amount of structure needed to reach satisfactory predictions (Sec. 4)
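The first steps of this recipe can be sketched as a small simulation. The code below assumes one common parameterization of Random Lotka-Volterra, dN_i/dt = N_i (K_i - N_i - Σ_{j≠i} A_ij N_j) with A_ij drawn with mean μ/S and standard deviation σ/√S and K_i drawn around 1 with standard deviation ζ. These scalings, the forward Euler integrator, the extinction threshold and all names are illustrative assumptions, not the manuscript's prescribed implementation.

```python
import random


def simulate_random_lv(S=30, mu=0.5, sigma=0.3, zeta=0.1,
                       t_max=100.0, dt=0.05, seed=0):
    """Integrate an assumed Random Lotka-Volterra model with forward Euler.

    Returns final abundances N, carrying capacities K and interactions A.
    """
    rng = random.Random(seed)
    # Stable differences between species: random carrying capacities ...
    K = [rng.gauss(1.0, zeta) for _ in range(S)]
    # ... and a random interaction network (zero diagonal).
    A = [[0.0 if i == j else rng.gauss(mu / S, sigma / S ** 0.5)
          for j in range(S)] for i in range(S)]
    N = [rng.uniform(0.1, 1.0) for _ in range(S)]
    for _ in range(int(t_max / dt)):
        growth = [K[i] - N[i] - sum(A[i][j] * N[j] for j in range(S))
                  for i in range(S)]
        # Abundances are clamped at zero to avoid negative values.
        N = [max(N[i] + dt * N[i] * growth[i], 0.0) for i in range(S)]
    return N, K, A


def summarize(N, threshold=1e-6):
    """Aggregate patterns to compare with data (threshold is arbitrary)."""
    alive = [n for n in N if n > threshold]
    return {
        "fraction_surviving": len(alive) / len(N),
        "total_biomass": sum(alive),
    }
```

Fitting would then amount to scanning `mu`, `sigma` and `zeta` and comparing the summarized patterns with their measured counterparts. Fluctuating differences (external noise) are left out of this sketch.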

### 5 Conclusions

- When dealing with many-species, high-dimensional systems, the assumption of disorder or emergent simplicity entails that many model details are of limited importance, so we can make robust predictions (qualitatively, sometimes even quantitatively).
- An important consequence is that a model like Random Lotka-Volterra has very few fitting parameters: theory predicts (and simulations confirm) that the details of interactions matter only through simple statistics such as mean and variance [6]. Where we might otherwise have assumed simple (e.g. Gaussian) interaction distributions for practicality, here we are guaranteed that there is truly no need to go to a more complicated one (under known conditions).
- The Random Lotka-Volterra model is a possible baseline to study a broad class of models that all share a few premises: population growth (and especially its exponential tendency) is the most central process to explain the phenomena we are interested in, and all other processes (e.g. species interactions, environment...) contribute in simple, essentially additive, ways to growth rate. Within this class, it potentially encompasses, qualitatively or even quantitatively, the predictions of a number of other approaches: most evidently the Stochastic Logistic Equation [3], but also neutral theory [32] (except for speciation and phylogeny-related questions) and stochastic geometry [1].
- On the other hand, this model is not the be-all and end-all of disordered approaches in community ecology. We can imagine that different variables and processes are much more of a driving or limiting force for community patterns, in which case modelling species abundances and their exponential tendencies would not provide much explanatory power. Many such other ways to “flatten” the complexity of an ecosystem, e.g. focusing on environmental spatial structure or on metabolic networks, could still benefit from ideas of disorder.
- Among empirical facts, there is a lot of focus on patterns of inequality between species (abundances, ranges, etc.). These admit multiple competing explanations within the Random Lotka-Volterra model: strong inequalities could arise from broadly-distributed carrying capacities or environmental “fitness” (as in [3, 52]), from noise as in neutral theory [32], or from many-species chaotic dynamics. The chaotic regime of Random LV fits with the intuitions of [53], and since this chaotic phase behaves quite like a logistic equation with environmental noise, it also fits with Grilli [3]. It gives decreasing correlations with increasing diversity, as in [43] (who give a different explanation: dilution effects).
- In a disordered interaction network, species do not have a priori roles like top predator or keystone species. Instead, an emergent outcome is that some species are lucky or unlucky, and thus abundant or rare (in some sense, usually relative to their carrying capacities). Therefore, many properties, such as how a species contributes to stability, can be predicted by its rarity. Being lucky or unlucky is a collective, context-dependent property tied to the whole network: if one species is more abundant than another in one community, the reverse is as likely to hold in another community, even if the environmental conditions are all identical.
- Chaotic and successional dynamics in many-species Random LV are very different from preexisting ecological intuitions of few-species chaos or deterministic (Clementsian) succession. They mainly rely on the exploration of many different combinations of species, either undirected in time (chaos) or directed on average by a metric of maturity (succession) even though individual jumps between compositions are largely unpredictable.
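The claim above that interaction details matter only through simple statistics such as mean and variance suggests a direct numerical check: draw interactions from differently shaped distributions that share the same mean and variance, feed them to the same simulation, and compare aggregate outcomes. The sketch below only provides such moment-matched samplers; the choice of distributions and all names are illustrative assumptions.

```python
import math
import random


def draw_gaussian(mean, std, rng):
    """Normal distribution with the requested mean and standard deviation."""
    return rng.gauss(mean, std)


def draw_uniform(mean, std, rng):
    """Uniform distribution; a half-width of sqrt(3)*std gives variance std^2."""
    half_width = math.sqrt(3.0) * std
    return rng.uniform(mean - half_width, mean + half_width)


def draw_two_point(mean, std, rng):
    """Two equally likely values mean +/- std, which has variance std^2."""
    return mean + std if rng.random() < 0.5 else mean - std


def sample_moments(draw, mean, std, n, seed=0):
    """Empirical mean and variance of n draws from one of the samplers."""
    rng = random.Random(seed)
    xs = [draw(mean, std, rng) for _ in range(n)]
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / n
    return m, v
```

If the theory applies, communities simulated with any of these three samplers should show statistically similar aggregate patterns for large numbers of species, provided the conditions mentioned above hold.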

### References

[1] Brian J McGill. Towards a unification of unified theories of biodiversity. Ecology letters, 13(5):627–642, 2010.

[2] NS Jacobsen, C Lindemann, EA Martens, AB Neuheimer, K Olsson, A Palacz, AW Visser, N Wadhwa, and T Kiørboe. Characteristic sizes of life in the oceans, from bacteria to whales. Annu. Rev. Mar. Sci, 8:3–1, 2016.

[3] Jacopo Grilli. Macroecological laws describe variation and diversity in microbial communities. Nature communications, 11(1):1–11, 2020.

[4] John Harte. Maximum entropy and ecology: a theory of abundance, distribution, and energetics. OUP Oxford, 2011.

[5] Matthieu Barbier, Claire De Mazancourt, Michel Loreau, and Guy Bunin. Fingerprints of high-dimensional coexistence in complex ecosystems. Physical Review X, 11(1):011009, 2021.

[6] Matthieu Barbier, Jean-François Arnoldi, Guy Bunin, and Michel Loreau. Generic assembly patterns in complex ecological communities. Proceedings of the National Academy of Sciences, 115(9):2156–2161, 2018.

[7] Jianwei Chen, Anna Hanke, Halina E Tegetmeyer, Ines Kattelmann, Ritin Sharma, Emmo Hamann, Theresa Hargesheimer, Beate Kraft, Sabine Lenk, Jeanine S Geelhoed, et al. Impacts of chemical gradients on microbial community structure. The ISME journal, 11(4):920–931, 2017.

[8] Ian A Hatton, Ryan F Heneghan, Yinon M Bar-On, and Eric D Galbraith. The global ocean size spectrum from bacteria to whales. Science advances, 7(46):eabh3732, 2021.

[9] Jurg W Spaak and Frederik De Laender. Intuitive and broadly applicable definitions of niche and fitness differences. Ecology letters, 23(7):1117–1128, 2020.

[10] Damian W Rivett and Thomas Bell. Abundance determines the functional role of bacterial phylotypes in complex communities. Nature microbiology, 3(7):767–772, 2018.

[11] Guy Bunin. Ecological communities with Lotka-Volterra dynamics. Physical Review E, 95(4):042414, 2017.

[12] Matthieu Barbier, Guy Bunin, and Mathew A. Leibold. Getting more by asking for less: Linking species interactions to species co-distributions in metacommunities. in prep, 2022.

[13] H Resit Akcakaya, Roger Arditi, and Lev R Ginzburg. Ratio-dependent predation: an abstraction that works. Ecology, 76(3):995–1004, 1995.

[14] Matthieu Barbier, Laurie Wojcik, and Michel Loreau. A macro-ecological approach to predation density-dependence. Oikos, 130(4):553–570, 2021.

[15] André M de Roos. Dynamic population stage structure due to juvenile–adult asymmetry stabilizes complex ecological communities. Proceedings of the National Academy of Sciences, 118(21):e2023709118, 2021.

[16] Tobias Galla. Random replicators with asymmetric couplings. Journal of Physics A: Mathematical and General, 39(15):3853, 2006.

[17] H Rieger. Solvable model of a complex ecosystem with randomly interacting species. Journal of Physics A: Mathematical and General, 22(17):3447, 1989.

[18] Laura Sidhom and Tobias Galla. Ecological communities from random generalized Lotka-Volterra dynamics with nonlinear feedback. Physical Review E, 101(3):032101, 2020.

[19] Matthieu Barbier and DS Lee. Urn model for products’ shares in international trade. Journal of Statistical Mechanics: Theory and Experiment, 2017(12):123403, 2017.

[20] Stav Marcus, Ari M Turner, and Guy Bunin. Local and collective transitions in sparsely-interacting ecological communities. arXiv preprint arXiv:2110.13603, 2021.

[21] Jiliang Hu, Daniel R Amor, Matthieu Barbier, Guy Bunin, and Jeff Gore. Emergent phases of ecological diversity and dynamics mapped in microcosms. bioRxiv, 2021.

[22] Pierre Bizeul and Jamal Najim. Positive solutions for large random linear systems. Proceedings of the American Mathematical Society, 149(6):2333–2348, 2021.

[23] Robert M May. Will a large complex system be stable? Nature, 238(5364):413–414, 1972.

[24] Yoshimi Yoshino, Tobias Galla, and Kei Tokita. Statistical mechanics and stability of a model eco-system. Journal of Statistical Mechanics: Theory and Experiment, 2007(09):P09003, 2007.

[25] Tobias Galla. Dynamically evolved community size and stability of random Lotka-Volterra ecosystems. EPL (Europhysics Letters), 123(4):48004, 2018.

[26] Robert M May. Biological populations obeying difference equations: stable points, stable cycles, and chaos. Journal of Theoretical Biology, 51(2):511–524, 1975.

[27] Felix Roy, Matthieu Barbier, Giulio Biroli, and Guy Bunin. Complex interactions can create persistent fluctuations in high-diversity ecosystems. PLoS computational biology, 16(5):e1007827, 2020.

[28] Marten Scheffer, S Harry Hosper, Marie Louise Meijer, Brian Moss, and Erik Jeppesen. Alternative equilibria in shallow lakes. Trends in ecology & evolution, 8(8):275–279, 1993.

[29] Guy Bunin. Directionality and community-level selection. Oikos, 130(4):489–500, 2021.

[30] Yael Fried, David A Kessler, and Nadav M Shnerb. Communities as cliques. Scientific reports, 6(1):1–8, 2016.

[31] Mikhail Tikhonov. Community-level cohesion without cooperation. Elife, 5:e15747, 2016.

[32] Stephen P Hubbell. The unified neutral theory of biodiversity and biogeography. In The Unified Neutral Theory of Biodiversity and Biogeography. Princeton University Press, 2011.

[33] Michael Kalyuzhny, Ronen Kadmon, and Nadav M Shnerb. A neutral theory with environmental stochasticity explains static and dynamic properties of ecological communities. Ecology letters, 18(6):572–580, 2015.

[34] Bart Haegeman and Rampal S Etienne. Relaxing the zero-sum assumption in neutral biodiversity theory. Journal of Theoretical Biology, 252(2):288–294, 2008.

[35] Salvador Pueyo, Fangliang He, and Tommaso Zillio. The maximum entropy formalism and the idiosyncratic theory of biodiversity. Ecology letters, 10(11):1017–1028, 2007.

[36] Jean-François Arnoldi, Michel Loreau, and Bart Haegeman. The inherent multidimensionality of temporal variability: how common and rare species shape stability patterns. Ecology Letters, 22(10):1557–1567, 2019.

[37] J-F Arnoldi, Azenor Bideault, Michel Loreau, and Bart Haegeman. How ecosystems recover from pulse perturbations: A theory of short-to long-term responses. Journal of theoretical biology, 436:79–92, 2018.

[38] Jean-François Arnoldi, Matthieu Barbier, Ruth Kelly, György Barabás, and Andrew L Jackson. Invasions of ecological communities: Hints of impacts in the invader’s growth rate. Methods in Ecology and Evolution, 13(1):167–182, 2022.

[39] Yuval R Zelnik, Matthieu Barbier, David W Shanafelt, Michel Loreau, and Rachel M Germain. Linking intrinsic scales of ecological processes to characteristic scales of biodiversity and functioning patterns. bioRxiv, 2021.

[40] Yuval R Zelnik, Jean-François Arnoldi, and Michel Loreau. The three regimes of spatial recovery. Ecology, 100(2):e02586, 2019.

[41] Kevin Liautaud, Egbert H van Nes, Matthieu Barbier, Marten Scheffer, and Michel Loreau. Superorganisms or loose collections of species? a unifying theory of community patterns along environmental gradients. Ecology letters, 22(8):1243–1252, 2019.

[42] Julia Chacón-Labella, Marcelino de la Cruz, and Adrián Escudero. Evidence for a stochastic geometry of biodiversity: the effects of species abundance, richness and intraspecific clustering. Journal of Ecology, 105(2):382–390, 2017.

[43] Thorsten Wiegand, Andreas Huth, Stephan Getzin, Xugao Wang, Zhanqing Hao, CV Savitri Gunatilleke, and IAU Nimal Gunatilleke. Testing the independent species’ arrangement assertion made by theories of stochastic geometry of biodiversity. Proceedings of the Royal Society B: Biological Sciences, 279(1741):3312–3320, 2012.

[44] Samuel R Bray and Bo Wang. Forecasting unprecedented ecological fluctuations. PLoS computational biology, 16(6):e1008021, 2020.

[45] Felix May, Thorsten Wiegand, Sebastian Lehmann, and Andreas Huth. Do abundance distributions and species aggregation correctly predict macroecological biodiversity patterns in tropical forests? Global Ecology and Biogeography, 25(5):575–585, 2016.

[46] Guy Bunin. Interaction patterns and diversity in assembled ecological communities. arXiv preprint arXiv:1607.04734, 2016.

[47] Shaopeng Wang, Michel Loreau, Jean-Francois Arnoldi, Jingyun Fang, K Abd Rahman, Shengli Tao, and Claire de Mazancourt. An invariability-area relationship sheds new light on the spatial scaling of ecological stability. Nature Communications, 8(1):1–8, 2017.

[48] Jonathan M Levine, Peter B Adler, and Stephanie G Yelenik. A meta-analysis of biotic resistance to exotic plant invasions. Ecology letters, 7(10):975–989, 2004.

[49] Michael Kalyuzhny and Nadav M Shnerb. Dissimilarity-overlap analysis of community dynamics: Opportunities and pitfalls. Methods in Ecology and Evolution, 8(12):1764–1773, 2017.

[50] Hugo Fort. Quantitative predictions from competition theory with an incomplete knowledge of model parameters tested against experiments across diverse taxa. Ecological Modelling, 368:104–110, 2018.

[51] Felipe Vaca-Ramírez and Tiago P Peixoto. Systematic assessment of the quality of fit of the stochastic block model for empirical networks. Physical Review E, 105(5):054311, 2022.

[52] Silvia Zaoli and Jacopo Grilli. The stochastic logistic model with correlated carrying capacities reproduces beta-diversity metrics of microbial communities. PLoS computational biology, 18(4):e1010043, 2022.

[53] Jacob D O’Sullivan, Robert J Knell, and Axel G Rossberg. Metacommunity-scale biodiversity regulation and the self-organised emergence of macroecological patterns. Ecology letters, 22(9):1428–1438, 2019.

^{1}Technical difficulties will be left to appendices or future developments?

^{2}And conversely, which few-species phenomena are unlikely to matter in diverse communities.

^{3}For instance, it can recover neutral theory’s core idea that realistic patterns of inequality between species can arise even without the “winners” having any ingrained biological advantage, simply out of chance and the self-reinforcing, rich-get-richer nature of ecological dynamics [?].

^{4}On the spectrum between these two traditions, one extreme is represented by the purest mechanistic reductionism, anchoring ecology in physics and chemistry where ingredients are truly knowable as exactly as possible, e.g. [2]. The other extreme includes phenomenological approaches (statistical models, random sampling [3], Maximum Entropy [4], stochastic geometry [1]) that postulate almost no ingredients beyond the observed patterns – though those very few ingredients, e.g. assuming linear relations or independent samples, often turn out to play a deviously profound role in model results. Other approaches always have a little bit of both, and there is no real sharp divide except those imposed by habit.

^{5}There are beautiful examples of capturing this structured heterogeneity through simple physical rules in the ocean in [2], or chemical constraints such as the redox tower [7].

^{6}These are consequences that go beyond just linear fluxes, and extend to a lot of “sub-exponential behavior”. More singular consequences arise if the dynamics are really linear.

^{7}See for instance nonlinearities in [17, 6, 18] and neural network models. In most of these cases, differences with the random LV model are easy to guess from basic intuitions.

^{8}Notably, heterogeneity in carrying capacities $\mathrm{var}(K)$ and additional correlations between parameters, such as interaction symmetry $\gamma = \mathrm{corr}(A_{ij}, A_{ji})$ (see [6] for intuitions regarding these parameters and their meaning).

^{9}There are very subtle caveats there that we don’t understand well: for a high extinction threshold, it’s mainly transient behaviors (typically leading to much wider fluctuations than those found later in a stationary chaotic behavior when that is observed in metacommunities [27]) leading to all these extinctions; if you lower the extinction cutoff a lot, it seems that chaotic dynamics are mainly extinguished by deterministic loss of diversity (i.e. species cannot survive in the long term in a single patch unless they could do so at a fixed point). So it’s not necessarily chaos killing itself through “activated events” (i.e. large fluctuations touching the extinction threshold), though this may still happen in principle.