The Azimuth Project
Blog - network theory (part 23)

This page is a blog article in progress, written by John Baez. To see discussions of this article while it is being written, visit the Azimuth Forum.

We’ve been looking at reaction networks, and we’re getting ready to find equilibrium solutions of the equations they give. To do this, we’ll need to connect them to another kind of network we’ve studied. A reaction network is something like this:

It’s a bunch of complexes, which are sums of basic building-blocks called species, together with arrows called transitions going between the complexes. If we know a number for each transition describing the rate at which it occurs, we get an equation called the ‘rate equation’. This describes how the amount of each species changes with time. We’ve been talking about this equation ever since the start of this series! Last time, we wrote it down in a new very compact form:

dxdt=YHx Y \displaystyle{ \frac{d x}{d t} = Y H x^Y }

But now suppose we forget how each complex is made of species! Suppose we just think of them as abstract things in their own right, like numbered boxes:

We can use these boxes to describe states of some system. The arrows still describe transitions, but now we think of these as ways for the system to hop from one state to another. Say we know a number for each transition describing the probability per time at which it occurs:

Then we get a ‘Markov process’—or in other words, a random walk where our system hops from one state to another. If ψ\psi is the probability distribution saying how likely the system is to be in each state, this Markov process is described by this equation:

dψdt=Hψ \displaystyle{ \frac{d \psi}{d t} = H \psi }

This is simpler than the rate equation, because it’s linear. But the operator HH is the same—we’ll see that explicitly later on today.

What’s the point? Well, our ultimate goal is to prove the deficiency zero theorem, which gives equilibrium solutions of the rate equation. That means finding xx with

YHx Y=0 Y H x^Y = 0

Today we’ll find all equilibria for the Markov process, meaning all ψ\psi with

Hψ=0 H \psi = 0

Then next time we’ll show some of these have the form

ψ=x Y \psi = x^Y

So, we’ll get

Hx Y=0 H x^Y = 0

and thus

YHx Y=0 Y H x^Y = 0

as desired!

So, let’s get to to work.

The Markov process of a graph with rates

We’ve been looking at stochastic reaction networks, which are things like this:

However, we can build a Markov process starting from just part of this information:

Let’s call this thing a ‘graph with rates’, for lack of a better name. We’ve been calling the things in KK ‘complexes’, but now we’ll think of them as ‘states’. So:

Definition. A graph with rates consists of:

• a finite set of states KK,

• a finite set of transitions TT,

• a map r:T(0,)r: T \to (0,\infty) giving a rate constant for each transition,

source and target maps s,t:TKs,t : T \to K saying where each transition starts and ends.

Starting from this, we can get a Markov process describing how a probability distribution ψ\psi on our set of states will change with time. As usual, this Markov process is described by a master equation:

dψdt=Hψ \displaystyle{ \frac{d \psi}{d t} = H \psi }

for some Hamiltonian:

H: K K H : \mathbb{R}^K \to \mathbb{R}^K

What is this Hamiltonian, exactly? Let’s think of it as a matrix where H ijH_{i j} is the probability per time for our system to hop from the state ii to the state jj. (This looks backwards, but don’t blame me—blame the guys who invented the usual conventions for matrix algebra.) Clearly if iji \ne j this probability per time should be the sum of the rate constants of all transitions going from jj to ii:

ijH ij= τ:jir(τ) i \ne j \quad \Rightarrow \quad H_{i j} = \sum_{\tau: j \to i} r(\tau)

We saw in Part 11 that for a probability distribution to remain a probability distribution as it evolves in time according to the master equation, we need HH to be infinitesimal stochastic: its off-diagonal entries must be nonnegative, and the sum of the entries in each column must be zero.

The first condition holds already, and the second one tells us what the diagonal entries must be. So, we’re basically done describing HH. But we can summarize it this way:

Puzzle. Think of K\mathbb{R}^K as the vector space consisting of finite linear combinations of elements κK\kappa \in K. Then

Hκ= s(τ)=κr(τ)(t(τ)s(τ)) H \kappa = \sum_{s(\tau) = \kappa} r(\tau) (t(\tau) - s(\tau))

Equilibrium solutions of the master equation

Now we’ll classify equilibrium solutions of the master equation, meaning ψ K\psi \in \mathbb{R}^K with

Hψ=0 H \psi = 0

We’ll do only do this when our graph with rates is ‘weakly reversible’. This concept doesn’t actually depend on the rates, so let’s be general and say:

Definition. A graph is weakly reversible if for every edge τ:ij\tau : i \to j, there is directed path going back from jj to ii, meaning that we have edges

τ 1:jj 1,τ 2:j 1j 2,,τ n:j n1i \tau_1 : j \to j_1 , \quad \tau_2 : j_1 \to j_2 , \quad \dots, \quad \tau_n: j_{n-1} \to i

This graph with rates is not weakly reversible:

but this one is:

The good thing about the weakly reversible case is that we get one equilibrium solution of the master equation for each component of our graph, and all equilibrium solutions are linear combinations of these. This is not true in general! For example, this guy is not weakly reversible:

It has only one component, but two linearly independent solutions: one that vanishes except at the state 0, and one that vanishes except at the state 2.

The idea of a ‘component’ is supposed to be fairly intuitive—our graph falls apart into pieces called components—but we should make it precise. As explained in Part 21, the graphs we’re using here are directed multigraphs, meaning things like

s,t:EV s, t : E \to V

where EE is the set of edges (our transitions) and VV is the set of vertices (our states). There are actually two famous concepts of ‘component’ for graphs of this sort: ‘strongly connected’ components and ‘connected’ components. We only need connected components, but let me explain both concepts, in a feeble attempt to slake your insatiable thirst for knowledge.

Two vertices ii and jj of a graph lie in the same strongly connected component iff you can find a directed path of edges from ii to jj and also one from jj back to ii.

Remember, a directed path from ii to jj looks like this:

iabcj i \to a \to b \to c \to j

Here’s a path from xx to yy is not directed:

iabcj i \to a \leftarrow b \to c \to j

and I hope you can write down the obvious but tedious definition of an ‘undirected path’, meaning a path made of edges that don’t necessarily point in the correct direction. Given that, we say two vertices ii and jj lie in the same connected component iff you can find an undirected path going from ii to jj. In this case, there will automatically also be an undirected path going from jj to ii.

For example, ii and jj lie in the same connected component here, but not the same strongly connected component:

iabcj i \to a \leftarrow b \to c \to j

Here’s a graph with one connected component and 3 strongly connected components, which are marked in blue:

For the theory we’re looking at now, we only care about connected components, not strongly connected components! However:

Puzzle 2. Show that for weakly reversible graph, the connected components are the same as the strongly connected components.

With these definitions out of way, we can state today’s big theorem:

Theorem. Suppose HH is the Hamiltonian of a weakly reversible graph with rates:

Then for each connected component CKC \subseteq K, there exists a unique probability distribution ψ C K\psi_C \in \mathbb{R}^K that is positive on that component, zero elsewhere, and is an equilibrium solution of the master equation:

Hψ C=0 H \psi_C = 0

Moreover, these probability distributions ψ C\psi_C form a basis for the space of equilibrium solutions of the master equation. So, the dimension of this space is the number of components of KK.

Proof. We start by assuming our graph has one connected component. We use the Perron–Frobenius theorem, as explained in Part 20. This applies to ‘nonnegative’ matrices, meaning those whose entries are all nonnegative. That is not true of HH itself, but only its diagonal entries can be negative, so if we choose a large enough number c>0c > 0, H+cIH + c I will be nonnegative.

Since our graph is weakly reversible and has one connected component, it follows straight from the definitions that the operator H+cIH + c I will also be ‘irreducible’ in the sense of Part 20. The Perron–Frobenius theorem then swings into action, and we instantly conclude several things.

First, H+cIH + c I has a positive real eigenvalue rr such that any other eigenvalue, possibly complex, has absolute value r\le r. Second, there is an eigenvector ψ\psi with eigenvalue rr and all positive components. Third, any other eigenvector with eigenvalue rr is a scalar multiple of ψ\psi.

Subtracting cc, it follows that λ=rc\lambda = r - c is the eigenvalue of HH with the largest real part. We have Hψ=λψH \psi = \lambda \psi, and any other vector with this property is a scalar multiple of ψ\psi.

We can show that in fact λ=0\lambda = 0. To do this we copy an argument from Part 20. First, since ψ\psi is positive we can normalize it to be a probability distribution:

iKψ i=1 \displaystyle{ \sum_{i \in K} \psi_i = 1 }

Since HH is infinitesimal stochastic, exp(tH)\exp(t H) sends probability distributions to probability distributions:

iK(exp(tH)ψ) i=1 \displaystyle{ \sum_{i \in K} (\exp(t H) \psi)_i = 1 }

for all t0t \ge 0. On the other hand,

iK(exp(tH)ψ) i= iKe tλψ i=e tλ \sum_{i \in K} (\exp(t H)\psi)_i = \sum_{i \in K} e^{t \lambda} \psi_i = e^{t \lambda}

so we must have λ=0\lambda = 0.

We conclude that when our graph has one connected component, there is a probability distribution ψ K\psi \in \mathbb{R}^K that is positive everywhere and has Hψ=0H \psi = 0. Moreover, any ϕ K\phi \in \mathbb{R}^K with Hϕ=0H \phi = 0 is a scalar multiple of ψ\psi.

When KK has several components, the matrix HH is block diagonal, with one block for each component. So, we can run the above argument on each component CKC \subseteq K and get a probability distribution ψ C K\psi_C \in \mathbb{R}^K that is positive on CC. We can then check that Hψ C=0H \psi_C = 0 and that every ϕ K\phi \in \mathbb{R}^K with Hϕ=0H \phi = 0 can be expressed as a linear combination of these probability distributions ψ C\psi_C in a unique way.   █

This result must be absurdly familiar to people who study Markov processes, but I haven’t bothered to look up a reference yet. Do you happen to know a good one? I’d like to see one that generalizes this theorem to graphs that aren’t weakly reversible. I think I see how it goes. We don’t need that generalization right now, but it would be good to have around.

The Hamiltonian, revisited

One last small piece of business: last time I showed you a very slick formula for the Hamiltonian HH. I’d like to prove it agrees with the formula I gave this time.

We start with any graph with rates:

We extend ss and tt to linear maps between vector spaces:

We define the boundary operator just as we did last time:

=ts \partial = t - s

Then we put an inner product on the vector spaces T\mathbb{R}^T and K\mathbb{R}^K. So, for K\mathbb{R}^K we let the elements of KK be an orthonormal basis, but for T\mathbb{R}^T we define the inner product in a more clever way involving the rate constants:

τ,τ=1r(τ)δ τ,τ \langle \tau, \tau' \rangle = \frac{1}{r(\tau)} \delta_{\tau, \tau'}

where τ,τT\tau, \tau' \in T. This lets us define adjoints of the maps s,ts, t and \partial, via formulas like this:

s ϕ,ψ=ϕ,sψ \langle s^\dagger \phi, \psi \rangle = \langle \phi, s \psi \rangle


Theorem. The Hamiltonian for a graph with rates is given by

H=s H = \partial s^\dagger

Proof. It suffices to check that this formula agrees with the formula for HH given in the Puzzle:

Hκ= s(τ)=κr(τ)(t(τ)s(τ)) H \kappa = \sum_{s(\tau) = \kappa} r(\tau) (t(\tau) - s(\tau))

Recall that here we are using the complex latexκKlatex \kappa \in K as a name for one of the standard basis vectors of latex Klatex \mathbb{R}^K. Similarly shall we write things like τ\tau or τ\tau' for basis vectors of T\mathbb{R}^T.

First, we claim that

s κ= τ:s(τ)=κr(τ)τ s^\dagger \kappa = \sum_{\tau: s(\tau) = \kappa} r(\tau) \, \tau

To prove this it’s enough to check that taking the inner products of either sides with any basis vector τ\tau', we get results that agree. On the one hand:

τ,s κ = sτ,κ = δ s(τ),κ \begin{array}{ccl} \langle \tau' , s^\dagger \kappa \rangle &=& \langle s \tau', \kappa \rangle \\ \\ &=& \delta_{s(\tau'), \kappa} \end{array}

On the other hand:

τ, τ:s(τ)=κr(τ)τ = τ:s(τ)=κr(τ)τ,τ = τ:s(τ)=κδ τ,τ = δ s(τ),κ \begin{array}{ccl} \displaystyle{ \langle \tau', \sum_{\tau: s(\tau) = \kappa} r(\tau) \, \tau \rangle } &=& \sum_{\tau: s(\tau) = \kappa} r(\tau) \, \langle \tau', \tau \rangle \\ &=& \sum_{\tau: s(\tau) = \kappa} \delta_{\tau', \tau} \\ &=& \delta_{s(\tau'), \kappa} \end{array}

where the factor of 1/r(τ)1/r(\tau) in the inner product on T\mathbb{R}^T cancels the visible factor of r(τ)r(\tau). So indeed the results match.

Using this formula for s κs^\dagger \kappa we now see that

Hκ = s κ = τ:s(τ)=κr(τ)τ = τ:s(τ)=κr(τ)(t(τ)s(τ)) \begin{array}{ccl} H \kappa &=& \partial s^\dagger \kappa \\ &=& \partial \displaystyle{ \sum_{\tau: s(\tau) = \kappa} r(\tau) \, \tau } \\ &=& \displaystyle{ \sum_{\tau: s(\tau) = \kappa} r(\tau) \, (t(\tau) - s(\tau)) } \end{array}

which is precisely what we want.   █

category: blog