The Azimuth Project
Blog - hierarchical organization and biological evolution (part 3)

This page is a blog article in progress, written by Cameron Smith. To discuss this article while it’s being written, visit the Azimuth Forum.

An attempt to review some of the literature on major transitions in evolution and multi-level selection, sketch a few connections to concepts in category theory, and discuss the potential for using experimental evolution to investigate and strengthen those connections.



multi-level selection


A model that unifies all types of selection (chemical, sociological, genetical, and every other kind of selection) may open the way to develop a general ‘Mathematical Theory of Selection’ analogous to communication theory. -George R. Price, 1971

The Price equation (sometimes referred to as Price’s theorem) provides a statistical description of an evolutionary process. It partitions the change between generations in the population-averaged value of a trait $z$, written $\Delta z$, into components due to selection and transmission. What is meant by selection and transmission can begin to be understood from a cartoon:


Example of a selective system using the notation of the Price Equation. The initial population, the left column of beakers, is divided into subpopulations indexed by $i$, where $q_i$ is the fraction of the total population in the $i$-th subpopulation. In this drawing, the two different kinds of transmissible material, solid and striped, are in separate subpopulations initially, but that is not necessary. Each subpopulation expresses a character value (phenotype), $z_i$. Any arbitrary rule can be used to assign trait values. Selection describes the changes in the quantities of the transmissible materials, where the primes on symbols denote the next time period. Thus $q'_i = q_i w_i/\bar{w}$ is the proportion of the descendant population derived from the $i$-th subpopulation of the initial population. The transmissible material may be redistributed to new groupings during or after the selective processes. The $q'_{j \dot i}$ are the fractions of the $i$-th parental subpopulation, after selection, that end up in the $j$-th descendant subpopulation, thus $\sum_j q'_{j \dot i} = 1$. The new mixtures in the $j$-th subpopulations express trait values $y_j$ according to whatever arbitrary rules are in effect. This allows full context-dependence (non-additivity) in the phenotypic expression of the transmissible material. Descendant trait values are assigned to the original subpopulations by weighting the contributions of those subpopulations, $z'_i = \sum_j q'_{j \dot i} y_j$. Thus, the average trait value in the descendant population is $\bar{z}' = \sum_i q'_i z'_i$ (Frank, 1995, Fig. 1).

The two terms in the Price equation are:

  • the covariation between the fitness relative to the population average $\left(v_i=\frac{w_i}{w}\right)$ and the trait value,
  • and the fitness-weighted expected value of the change in the trait value between generations (Price, 1970, Price, 1972, Price, 1995).

First I’ll provide a construction of the Price equation, using slightly different notation from the figure legend above, and then discuss its meaning in more detail.


If $n_i \in \mathbb{Z}^*$ is the number of occurrences for each $x_i,\,y_i \in \mathbb{R}$ then:

The expected value of the $x_i$ values weighted by $n_i$ is:

$$\operatorname{E}(x_i) \stackrel{\mathrm{def}}{=} \frac{\sum_i x_i n_i}{\sum_i n_i}. \qquad (1)$$

The covariance between the $x_i$ and $y_i$ values weighted by $n_i$ is:

$$\operatorname{Cov}(x_i,y_i) \stackrel{\mathrm{def}}{=} \frac{\sum_i n_i[x_i-\operatorname{E}(x_i)][y_i-\operatorname{E}(y_i)]}{\sum_i n_i} = \operatorname{E}(x_i y_i)-\operatorname{E}(x_i)\operatorname{E}(y_i). \qquad (2)$$
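As a quick sanity check on Equations (1) and (2), here is a minimal Python sketch. The function names `expectation` and `covariance` and the sample numbers are illustrative assumptions, not part of the construction; exact fractions are used so the two forms of the covariance can be compared without rounding error.

```python
from fractions import Fraction

def expectation(x, n):
    """Mean of the values x weighted by the counts n, as in Equation (1)."""
    return sum(Fraction(xi) * ni for xi, ni in zip(x, n)) / sum(n)

def covariance(x, y, n):
    """Weighted covariance of x and y, first form of Equation (2)."""
    ex, ey = expectation(x, n), expectation(y, n)
    return sum(ni * (xi - ex) * (yi - ey) for xi, yi, ni in zip(x, y, n)) / sum(n)

# Made-up values and occurrence counts, purely for illustration.
x = [1, 2, 4]
y = [3, 1, 2]
n = [2, 3, 5]

# Second form of Equation (2): E(xy) - E(x)E(y).
xy = [xi * yi for xi, yi in zip(x, y)]
cov_direct = covariance(x, y, n)
cov_shortcut = expectation(xy, n) - expectation(x, n) * expectation(y, n)

print(cov_direct, cov_shortcut)
```

Both printed values should agree, which is all that the second equality in Equation (2) claims.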

Suppose there is a population (a set) wherein each individual entity has a characteristic described by a number in $\mathbb{R}$. For example, a higher number for one individual represents a greater value of that characteristic than that of an individual with a lower number. The characteristic value is not to be conflated with fitness (defined below). Let subscript $i$ identify the group with characteristic value $z_i$ and let $n_i$ be the number of individuals in that group. The total number of individuals is then $n$ where:

$$n = \sum_i n_i.$$

The average value of the characteristic, $z$, is defined as:

$$z \stackrel{\mathrm{def}}{=} \operatorname{E}(z_i) = \frac{1}{n} \sum_i z_i n_i. \qquad (3)$$

Suppose that the population reproduces, all parents are eliminated, and there is a selection process on the offspring, by which offspring deemed less fit are removed from the reproducing population. After reproduction and selection, the population numbers for the offspring groups will change to $n'_j$. Primes or $j$-indices denote relevance to the offspring population, and the absence of primes or $i$-indices, the like for the parent population.

The total number of offspring is $n'$ where:

$$n' = \sum_i n'_i.$$

The fitness of group $i$ will be defined to be the within-group ratio of offspring to parents:

$$w_i = \frac{n'_i}{n_i}, \qquad (4)$$

with average fitness of the population being

$$w \stackrel{\mathrm{def}}{=} \operatorname{E}(w_i) = \frac{1}{n} \sum_i w_i n_i = \frac{1}{n} \sum_i \frac{n'_i}{n_i} n_i = \frac{1}{n} \sum_i n'_i = \frac{n'}{n}. \qquad (5)$$

The average value of the offspring characteristic will be $z'$ where:

$$\begin{array}{rcl} z' & = & \frac{1}{n'} \sum_{j} z'_j n'_j \\ & = & \frac{1}{n} \sum_{i} \frac{w_i}{w} z'_i n_i. \end{array} \qquad (6)$$

Here the $z'_i$ represent the average character values of the offspring of each group $i$, and the $z'_j$ represent the character values of the offspring for the (potentially different) groupings $j$. Thus, the average character value of the offspring population, $z'$, can be determined by summing over either the parent ($i$) or the offspring ($j$) population groupings.


(Price’s theorem). The change in the population averaged trait value, $\Delta z = z'-z$, between generations is given by:

$$w\,\Delta z = \operatorname{Cov}(w_i,z_i)+\operatorname{E}(w_i\,\Delta z_i)$$

Equation (2) shows that:

$$\operatorname{Cov}(w_i,z_i)=\operatorname{E}(w_i z_i)-w z \qquad (7)$$

Call the change in characteristic value from parent to child populations $\Delta z_i$ so that $\Delta z_i = z'_i - z_i$. As seen in Equation (1), the expected value operator $\operatorname{E}$ is linear, so

$$\operatorname{E}(w_i\,\Delta z_i)=\operatorname{E}(w_i z'_i)-\operatorname{E}(w_i z_i) \qquad (8)$$

Combining Equations (7) and (8) leads to

$$\operatorname{Cov}(w_i,z_i)+\operatorname{E}(w_i\,\Delta z_i) = \bigl(\operatorname{E}(w_i z_i)-w z \bigr) + \bigl(\operatorname{E}(w_i z'_i)-\operatorname{E}(w_i z_i)\bigr) = \operatorname{E}(w_i z'_i)-w z \qquad (9)$$

but Equation (1) gives:

$$\operatorname{E}(w_i z'_i)=\frac{1}{n} \sum_i w_i z'_i n_i$$

and Equation (4) then gives:

$$\operatorname{E}(w_i z'_i)=\frac{1}{n} \sum_i\frac{n'_i}{n_i}z'_i n_i = \frac{1}{n} \sum_i n'_i z'_i=\frac{n'}{n}\frac{\sum_i z'_i n'_i}{n'} \qquad (10)$$

Applying Equations (5) and (6) to Equation (10) and then applying the result to Equation (9) gives the Price Equation:

$$\operatorname{Cov}(w_i,z_i)+\operatorname{E}(w_i\,\Delta z_i)=w z'-w z=w\,\Delta z \qquad (11)$$

Let the relative fitness of group $i$ be:

$$v_i \stackrel{\mathrm{def}}{=} \frac{w_i}{w} \qquad (12)$$

The relative fitness formulation of the Price equation is:

$$\Delta z = \operatorname{Cov}(v_i,z_i)+\operatorname{E}(v_i\,\Delta z_i) \qquad (13)$$

Combining Equation (12) with Theorem 1 and the definitions of the $\operatorname{E}$ and $\operatorname{Cov}$ operators in Equations (1) and (2) respectively:

$$\begin{array}{rcl} \Delta z & = & \operatorname{Cov}\left(\frac{w_i}{w},z_i\right)+\operatorname{E}\left(\frac{w_i}{w}\,\Delta z_i\right) \\ & = & \operatorname{Cov}(v_i,z_i)+\operatorname{E}(v_i\,\Delta z_i) \end{array} \qquad (14)$$
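The derivation can be checked numerically. The following is only a sketch on made-up data for two parent groups; the variable names (`n_prime`, `z_prime`, `selection`, `transmission`, and so on) are my own labels for the quantities in Equations (4), (5), (11), (12) and (13), and exact fractions are used so that both sides can be compared with `==`.

```python
from fractions import Fraction

def expectation(x, n):
    """Mean of x weighted by counts n, as in Equation (1)."""
    return sum(Fraction(xi) * ni for xi, ni in zip(x, n)) / sum(n)

def covariance(x, y, n):
    """Weighted covariance via E(xy) - E(x)E(y), as in Equation (2)."""
    xy = [xi * yi for xi, yi in zip(x, y)]
    return expectation(xy, n) - expectation(x, n) * expectation(y, n)

# Made-up data for two parent groups (illustrative only).
n = [4, 6]          # parents in each group
z = [1, 3]          # parent trait value of each group
n_prime = [8, 6]    # offspring descended from each group
z_prime = [2, 3]    # mean trait value of each group's offspring

w = [Fraction(b, a) for a, b in zip(n, n_prime)]   # Equation (4)
w_bar = expectation(w, n)                          # Equation (5)
dz = [zp - zi for zp, zi in zip(z_prime, z)]       # per-group change in trait value
delta_z = expectation(z_prime, n_prime) - expectation(z, n)

# Absolute-fitness form, Equation (11).
selection = covariance(w, z, n)
transmission = expectation([wi * di for wi, di in zip(w, dz)], n)

# Relative-fitness form, Equations (12) and (13).
v = [wi / w_bar for wi in w]
selection_rel = covariance(v, z, n)
transmission_rel = expectation([vi * di for vi, di in zip(v, dz)], n)

print(w_bar * delta_z, selection + transmission)
print(delta_z, selection_rel + transmission_rel)
```

Each printed line should show the same value twice. In this particular example the selection term is negative while the transmission term is positive and larger, so the mean trait value still increases.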

It is essential to note the change of indices between $j$ (offspring) and $i$ (parents) in Equation (6). This indicates that the descendant population can have different groupings from the parent population. To write down an explicit method of computing $z'_i$, we must have kept track of the relationship between the groupings in the parent and offspring populations:

$$z'_i = \frac{1}{n'_i} \sum_j n_{i j} z'_j,$$

where $n_{i j}$ represents the number of individuals in group $j$ of the offspring population derived from group $i$ of the parent population, so that $\sum_j n_{i j} = n'_i$ (see the figure legend above and (Frank, 1995)). This seems to be a significant limitation because it requires interrogation of the distribution of the offspring population. In fact, it also implies that the Price equation is purely retrospective rather than predictive.
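To make the bookkeeping concrete, here is a small sketch that computes the $z'_i$ from a table of counts $n_{i j}$ and checks that the offspring average comes out the same whether we sum over parent or offspring groupings, as Equation (6) requires. It assumes that $z'_i$ is the mean over the offspring of group $i$, i.e. that the sum is normalised by $\sum_j n_{i j} = n'_i$; the numbers and variable names are invented for illustration.

```python
from fractions import Fraction

# n_ij[i][j]: offspring in offspring group j descended from parent group i
# (made-up counts, purely illustrative).
n_ij = [[3, 5],
        [4, 2]]
z_prime_j = [1, 3]   # trait value expressed by each offspring group j

n_prime_i = [sum(row) for row in n_ij]         # offspring per parent group i
n_prime_j = [sum(col) for col in zip(*n_ij)]   # size of each offspring group j
n_prime = sum(n_prime_i)                       # total number of offspring

# Mean trait value of the offspring of each parent group i
# (assumption: normalised by n'_i, the number of offspring of group i).
z_prime_i = [
    sum(Fraction(count) * zj for count, zj in zip(row, z_prime_j)) / total
    for row, total in zip(n_ij, n_prime_i)
]

# Offspring average computed over offspring groupings and over parent groupings.
z_bar_over_j = sum(Fraction(zj) * nj for zj, nj in zip(z_prime_j, n_prime_j)) / n_prime
z_bar_over_i = sum(zi * ni for zi, ni in zip(z_prime_i, n_prime_i)) / n_prime

print(z_prime_i, z_bar_over_j, z_bar_over_i)
```

Note that the whole table `n_ij` is needed as input, which is exactly the retrospective character of the equation described above.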

What is the meaning of the Price equation? The first term, $\operatorname{Cov}(v_i,z_i)$, measures the statistical relationship between the trait value, $z$, and the relative fitness, $v$. That is, if higher values of a particular trait are positively correlated with relative fitness this term will be positive, and if they are negatively correlated it will be negative. If there is no statistical association, or if either relative fitness or the trait value does not vary across the population, then $\operatorname{Cov}(v_i,z_i)=0$. Thus, $\operatorname{Cov}(v_i,z_i)$ measures the degree to which the trait in question is subject to selection, thereby encapsulating the organism-environment relationship for a single trait. It is sometimes called the selection differential. The second term in the Price equation, $\operatorname{E}(v_i\,\Delta z_i)$, is a measure of the so-called transmission bias of the trait. Each individual in the population has a $\Delta z_i$, so $\operatorname{E}(v_i\,\Delta z_i)$ is a fitness-weighted expectation that reflects both the degree to which the offspring of individual $i$ deviate from it in trait value and its number of offspring.

What is perhaps most interesting about the Price equation is that it can be expanded into what might be conceptualized as an arbitrary number of hierarchical levels due to its recursive nature. To my knowledge, Arnold and Fristrup were the first to describe this in detail (Arnold and Fristrup, 1982). The following is a construction of the hierarchical formulation of the Price equation for an arbitrary number of hierarchical levels.


Given an ordered set of sets of indices $\mathfrak{I} = \{ \mathbb{K}, \mathbb{L}, \ldots\, , \mathbb{M} \}$ where $\mathbb{K}= \{ k \mid k \in \mathbb{Z}^*, k = 1 \ldots\, K \}$ there are operations on the index sets such that $\mathbb{K} \oplus 1 = \mathbb{L}$ and $\mathbb{L} \ominus 1 = \mathbb{K}$.


The nested hierarchical form of the Price equation is:

$$\begin{split} w\,\Delta z =& \operatorname{Cov}_k(w_k,z_k) \\ & +\, \operatorname{E}_k ( \operatorname{Cov}_l(w_{k l},z_{k l}) \\ & +\, \operatorname{E}_l (\,\, \cdots\,\, \operatorname{E}_{m \ominus 1}( \operatorname{Cov}_m(w_{k l\, \cdots\, m},z_{k l\, \cdots\, m}) \\ & +\, \operatorname{E}_m(w_{k l\, \cdots\, m} \Delta z_{k l\, \cdots\, m}))\,\, \cdots \,\,)). \end{split} \qquad (15)$$

The number of levels in the hierarchy is equal to $|\mathfrak{I}|$.


Equation (15) is a direct result of writing the terms describing $w\,\Delta z$ in succession:

$$\begin{array}{rcl} w\,\Delta z &=& \operatorname{Cov}_k(w_k,z_k)+\operatorname{E}_k(w_k\,\Delta z_k), \\ w_k\,\Delta z_k &=& \operatorname{Cov}_l(w_{k l},z_{k l})+\operatorname{E}_l(w_{k l}\,\Delta z_{k l}), \\ w_{k l}\,\Delta z_{k l} &=& \operatorname{Cov}_{l \oplus 1}(w_{k l (l \oplus 1)},z_{k l (l \oplus 1)})+\operatorname{E}_{l \oplus 1}(w_{k l (l \oplus 1)}\,\Delta z_{k l (l \oplus 1)}), \\ &\vdots& \\ w_{k l \cdots\, m \ominus 1}\,\Delta z_{k l \cdots\, m \ominus 1} &=& \operatorname{Cov}_m(w_{k l \cdots\, m},z_{k l \cdots\, m}) + \operatorname{E}_m(w_{k l \cdots\, m} \Delta z_{k l \cdots\, m}). \end{array} \qquad (16)$$

Equation (15) results from substitution of each of Equations (16) into its immediate predecessor until the first of the set of equations (16) is reached.
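For two levels (groups $k$ containing subgroups $l$) the substitution can be verified directly. This is a sketch on invented data, with helper names of my own choosing; it assumes each within-group expectation $\operatorname{E}_l$ is weighted by the subgroup sizes $n_{k l}$ and the between-group operators by the group sizes $n_k = \sum_l n_{k l}$.

```python
from fractions import Fraction

def expectation(x, n):
    """Mean of x weighted by counts n, as in Equation (1)."""
    return sum(Fraction(xi) * ni for xi, ni in zip(x, n)) / sum(n)

def covariance(x, y, n):
    """Weighted covariance via E(xy) - E(x)E(y), as in Equation (2)."""
    xy = [xi * yi for xi, yi in zip(x, y)]
    return expectation(xy, n) - expectation(x, n) * expectation(y, n)

def group_summary(n, z, n_prime, z_prime):
    """Lower-level terms for one group k: (n_k, w_k, z_k, Cov_l, E_l(w dz))."""
    w = [Fraction(b, a) for a, b in zip(n, n_prime)]
    dz = [zp - zi for zp, zi in zip(z_prime, z)]
    w_dz = [wi * di for wi, di in zip(w, dz)]
    return (sum(n), expectation(w, n), expectation(z, n),
            covariance(w, z, n), expectation(w_dz, n))

# Made-up data: each tuple is (n_kl, z_kl, n'_kl, z'_kl) for one group k.
groups = [
    ([2, 2], [1, 3], [2, 6], [1, 3]),
    ([3, 3], [0, 2], [3, 3], [1, 2]),
]

n_k, w_k, z_k, cov_l, trans_l = zip(*(group_summary(*g) for g in groups))

# Two-level form of Equation (15): Cov_k(w_k, z_k) + E_k(Cov_l + E_l(w_kl dz_kl)).
between = covariance(w_k, z_k, n_k)
within = expectation([c + t for c, t in zip(cov_l, trans_l)], n_k)

def flatten(index):
    """Collect one field (0: n, 1: z, 2: n', 3: z') across all subgroups."""
    return [value for g in groups for value in g[index]]

# Left-hand side computed directly from the whole population.
n_all, z_all, n_prime_all, z_prime_all = (flatten(i) for i in range(4))
w_bar = Fraction(sum(n_prime_all), sum(n_all))
delta_z = expectation(z_prime_all, n_prime_all) - expectation(z_all, n_all)

print(w_bar * delta_z, between + within)
```

The two printed values should coincide. Adding a third level would amount to replacing each innermost `expectation(w_dz, n)` by another covariance-plus-expectation pair, mirroring the recursion in Equations (16).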

The meaning of the hierarchical formulation of the Price equation can be understood by considering each of Equations (16) as encapsulating the separation of the evolutionary process of some abstract characteristic into selection and transmission components, each for a particular hierarchical level. Equation (15) is simply the composition of all of Equations (16). In their current form, Equations (16) cannot be viewed as a mapping with a domain and co-domain. If we were to try to use the difference Equations (16) to update the necessary input data over a series of iterations, then defining the domain appears to require knowledge of characteristic values decomposed into the relevant set of hierarchical levels and summarized in some tensor $\mathbf{Z}_{K L\, \cdots\, M}$ with $\operatorname{dim} \{ \mathbf{Z} \} = K \times L \times \cdots\, \times M$. A similar tensor is required to describe the fitness values $\{w_{k l\, \cdots\, m} \}$. In each case, however, we need equivalent tensors for the offspring population in order to compute the $z'_i$. In order to update the characteristic value tensor for a single iteration we would have to add an average over a dimension to each indexed element in that dimension. Thus, the information required to fully reconstruct, rather than statistically summarize, the character value distribution of the offspring of each parent or parent group would be lost in each generation. In summary, as mentioned above, despite its potential usefulness as a conceptual tool, in its current form the Price equation could only be compared with an experiment if we had complete data for both the parent and offspring populations over all generations of interest. Later I’ll attempt to derive a stochastic version of the Price equation that might be more useful in attempting to embed it, or something else, within a higher-level framework.

I’ll probably need a lot of help with this over on the forum!

Another limitation of the Price equation is that the fitness values do not themselves evolve. Referring back to the organism-environment duality, it appears here that, while organisms in the environment are dynamic, the environment, which ultimately defines fitness, is static. While this may be true over appropriately defined timescales, it is certainly not true in general. A model taking into account this intuitive relationship, while refusing to let go of a distinction between organisms and environments, would allow for the coevolution of the environment (and thus of fitness). Alternatively, we could take the point of view that an organism, indeed all organisms, simply represent a particular subset of hierarchical levels embedded within a larger network representing the “environment”. In this context, the underlying levels (molecules, organisms, populations, communities, ecosystems) and the overlying levels may all be subsumed into a single framework wherein something like an “organism” would simply represent a pattern that can be detected via some, ideally analytical, means within a single “global” type of network. This is perhaps where the project I’m working on converges in some sense with that of John and Jacob.

a categorical view of evolutionary processes


This is clearly the least developed and probably the most important section, and the one with which I will require the most help if it is to become comprehensible or useful: see the Azimuth Forum.

Among the capabilities of the categorical language applied to the science of complex systems, as described in Ehresmann and Vanbremeersch’s Memory Evolutive Systems (MES), are, apparently, a characterization of the interface between simple and complex, a statement of the necessary conditions for reductionist theories to be successful in recapitulating natural systems, and a corresponding identification of the range of considerations necessary to construct reliable models when reductionism is insufficient. All of these constructions discussed in the first several chapters of MES are intellectually tantalizing, and it will be interesting to see where all of the directions suggested in MES could lead us. In any case, what I hope to discuss here is a bit more modest.

Although recklessly premature, I cannot resist juxtaposing these two figures:

There is probably some other structure (or at the very least a series of adjectives may need to be added for any modicum of precision) more appropriate for embedding either a model of hierarchical evolution or the hierarchical Price equation. These two are probably not the same, given that the former expresses an ideal and the latter a practical result in the potential direction of that ideal. The rough notion is of an evolutionary process modeled as a fibration, in which a category (perhaps an $n$-category) represents a collection of interacting objects that constitute an evolving system, and whose slicing parameter is identified with time. This is how Ehresmann and Vanbremeersch (EV) define it:


If $\mathbf{K}$ is an evolutive system, the fibration $\mathbf{FK}$ associated to $\mathbf{K}$ is a quasi-category which has for its set of objects the set $|K|$ of all the objects of $\mathbf{K}$, and which is generated by its following sub-categories (see the figure above):

  1. The configuration categories $K_t$ for each $t$; their links are called vertical links. $K_t$ is also called the fiber at $t$.
  2. The category associated to the order ‘earlier than or simultaneous with’ on $|K|$; these links are called horizontal links.

There are many definitions before and after this one in MES; however, the meaning of EV’s term evolutive system is probably the least decipherable without a slightly more explicit definition:


An evolutive system (or ES) $\mathbf{K}$ consists of the following (see figure below):

  1. A time scale $T$, which is an interval or a finite subset of $\mathbb{Z}^*$.
  2. For each instant $t$ of $T$, a category $K_t$ called the configuration category at $t$. These categories are disjoint.
  3. For each instant $t' > t$, a partial functor $k(t,t')$ from $K_t$ to $K_{t'}$, called the transition from $t$ to $t'$. These transitions satisfy the following transitivity condition (TC), given $t < t' < t''$ in $T$:

(TC) If the object $A_t$ has $A_{t'}$ for its new configuration at $t'$, and if $A_{t'}$ has a new configuration $A_{t''}$ at $t''$, then $A_{t''}$ is also the image of $A_t$ by the transition from $t$ to $t''$. Conversely, if $B_t$ transitions to a configuration $B_{t'}$ at $t'$ and to a configuration $B_{t''}$ at $t''$, then $B_{t'}$ must transition to a configuration at $t''$, and this configuration is $B_{t''}$. Similarly for the links.
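As a very rough illustration of what (TC) demands, here is a sketch that treats only the objects (ignoring the links and the categorical structure entirely) and represents each partial transition as a Python dictionary. The function name `satisfies_tc`, the dictionary encoding and the object labels are all my own assumptions for illustration; none of this comes from MES.

```python
def satisfies_tc(transitions, t, t1, t2):
    """Check (TC) on objects only, for instants t < t1 < t2.

    transitions[(a, b)] is a dict sending objects at time a to their
    configurations at time b; a missing key means the transition is
    undefined there (the functor is only partial).
    """
    k01 = transitions[(t, t1)]
    k12 = transitions[(t1, t2)]
    k02 = transitions[(t, t2)]
    for obj, mid in k01.items():
        # Forward part: obj -> mid -> end forces the direct transition to give end.
        if mid in k12 and k02.get(obj) != k12[mid]:
            return False
        # Converse part: if obj has configurations at t1 and t2,
        # its configuration at t1 must transition to the one at t2.
        if obj in k02 and k12.get(mid) != k02[obj]:
            return False
    return True

# Hypothetical example: A persists through all three instants, B disappears after t = 1.
good = {
    (0, 1): {"A0": "A1", "B0": "B1"},
    (1, 2): {"A1": "A2"},
    (0, 2): {"A0": "A2"},
}

# Same data, except B0 is given a configuration at t = 2 even though B1 has none.
bad = {
    (0, 1): {"A0": "A1", "B0": "B1"},
    (1, 2): {"A1": "A2"},
    (0, 2): {"A0": "A2", "B0": "B2"},
}

print(satisfies_tc(good, 0, 1, 2), satisfies_tc(bad, 0, 1, 2))
```

The second system fails the converse half of the condition: an object cannot skip an intermediate instant at which it already has a configuration.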



T. F. H. Allen and T. B. Starr, Hierarchy: Perspectives for Ecological Complexity. Chicago: University of Chicago Press, 1982, p. 326.

A. J. Arnold and K. Fristrup, The theory of evolution by natural selection: a hierarchical expansion, Paleobiology, vol. 8, no. 2, pp. 113–129, 1982.

A. C. Ehresmann and J. P. Vanbremeersch, Memory Evolutive Systems; Hierarchy, Emergence, Cognition, Volume 4 (Studies in Multidisciplinarity). Elsevier Science, 2007, p. 402.

G. L. Farre, The Energetic Structure of Observation: A Philosophical Disquisition, American Behavioral Scientist, vol. 40, no. 6, pp. 717–728, May 1997.

S. A. Frank, George Price’s contributions to evolutionary genetics, Journal of Theoretical Biology, vol. 175, no. 3, pp. 373–388, Aug. 1995.

S. A. Frank, Foundations of social evolution. Princeton Univ Press, 1998.

S. Okasha, Evolution and the levels of selection. New York: Oxford University Press, USA, 2006.

G. R. Price, Selection and Covariance, Nature, vol. 227, no. 5257, pp. 520–521, Aug. 1970.

G. R. Price, Extension of covariance selection mathematics, Annals of Human Genetics, vol. 35, no. 4, pp. 485–490, Apr. 1972.

G. R. Price, The nature of selection, Journal of Theoretical Biology, vol. 175, no. 3, pp. 389–396, Aug. 1995. (written ca. 1971 and published posthumously)

H. A. Simon, The architecture of complexity, Proceedings of the American Philosophical Society, vol. 106, no. 6, pp. 467–482, 1962.

H. A. Simon, Near decomposability and the speed of evolution, Industrial and Corporate Change, vol. 11, no. 3, pp. 587–599, Jun. 2002.

J. Maynard Smith and E. Szathmáry, The major transitions in evolution. New York: Oxford University Press, USA, 1995.


category: blog, biology