# Experimental observables / dynamical fingerprints

Frank Noé (frank.noe@fu-berlin.de)
FU Berlin, Arnimallee 6, 14195 Berlin, Germany

Summary: Markov models can be used to compute kinetic experimental observables, such as time-dependent relaxation or correlation functions. A useful way to represent such data is in terms of dynamical fingerprints, i.e. spectra that map relaxation timescales to the corresponding amplitudes in the experimental signal. Different experimental observables of the same molecular system are expected to have fingerprints with identical timescales but different amplitudes. The fingerprint thus characterizes which relaxation processes a specific experimental observable is sensitive to.

In experimental studies of protein folding, the conformational dynamics is mapped onto an observable $A(\mathbf{x}):\Omega\rightarrow\mathbb{R}$ which is measured. $A(\mathbf{x})$ could be a fluorescence or transfer efficiency in a fluorescence experiment, the chemical shift in an NMR experiment, the intensity of a given spectral peak in an IR experiment, the distance in a pulling experiment, and so forth. In the following we use a discretized version of $A(\mathbf{x})$ denoted by the vector $\mathbf{a}$, which assigns a scalar value to every discrete state $i$. Usually, $a_{i}$ is the ensemble average of $A(\mathbf{x})$ over the configurations belonging to state $i$, $\Omega_{i}$: $$a_{i}=\frac{1}{\pi_{i}}\int_{\mathbf{x}\in\Omega_{i}}d\mathbf{x}\:\mu(\mathbf{x})\, A(\mathbf{x}) \:\:\:\:(1)$$ We note that vector- or function-valued observables (such as entire spectra in IR or NMR data) could be treated in a similar way, although this is not done here. Given the observable vector, various experimental measurements can be expressed as derived in [3] and [2].
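As a rough illustration of Eq. 1, the following sketch estimates $a_{i}$ as the mean of a per-frame observable over all trajectory frames assigned to state $i$. It assumes that the frames within each state sample the local equilibrium distribution, so that a plain frame average approximates the $\mu$-weighted integral. The function name `state_averages` and the example data are hypothetical and not taken from the text.

```python
import numpy as np

def state_averages(obs, dtraj, n_states):
    """Estimate a_i as the mean of the observable over frames assigned to state i."""
    obs = np.asarray(obs, dtype=float)
    dtraj = np.asarray(dtraj, dtype=int)
    # Sum of observable values and number of frames per discrete state.
    sums = np.bincount(dtraj, weights=obs, minlength=n_states)
    counts = np.bincount(dtraj, minlength=n_states)
    # States that were never visited have no estimate and are marked NaN.
    a = np.full(n_states, np.nan)
    visited = counts > 0
    a[visited] = sums[visited] / counts[visited]
    return a

# Hypothetical data: observable value and discrete state index for six frames.
obs = [0.9, 1.1, 0.2, 0.0, 0.1, 1.0]
dtraj = [0, 0, 1, 1, 1, 0]
a = state_averages(obs, dtraj, n_states=3)
print(a)
```

Unvisited states are returned as NaN rather than zero so that a missing estimate is not mistaken for a vanishing signal.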

In equilibrium experiments, the observed molecule is in equilibrium with the current conditions of the surroundings (temperature, applied forces, salt concentration etc.), and the mean value of an observable $A$, $$\mathbb{E}[A]=\int_{\mathbf{x}\in\Omega}d\mathbf{x}\:\mu(\mathbf{x})\: A(\mathbf{x}) \:\:\:\:(2)$$ is recorded. This may be done either by measuring $\mathbb{E}[A]$ directly from an unperturbed ensemble of molecules, or by recording sufficiently many and sufficiently long single-molecule traces $A(t)=A(\mathbf{x}_{t})$ and averaging over them. The discrete approximation to the average is

\begin{aligned} \mathbb{E}_{\boldsymbol{\pi}}[a] & =\sum_{i=1}^{n}a_{i}\pi_{i}=\langle\mathbf{a},\boldsymbol{\pi}\rangle\:, \:\:\:\:(3)\end{aligned} where $\mathbb{E}_{\boldsymbol{\pi}}[a]$ denotes the expectation value of the discrete observable $a$ and $\langle\mathbf{x},\mathbf{y}\rangle$ denotes the scalar product between two vectors $\mathbf{x}$ and $\mathbf{y}$. Since $\boldsymbol{\pi}$ is the left eigenvector of the transition matrix $\mathbf{T}(\tau)$ with eigenvalue 1, it can easily be calculated from the MSM. $\mathbb{E}_{\boldsymbol{\pi}}[a]$ does not depend on time and therefore carries no kinetic information.
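A minimal sketch of Eq. 3, assuming a row-stochastic transition matrix: the stationary distribution is obtained as the left eigenvector with eigenvalue 1, and the equilibrium expectation is its scalar product with the observable vector. The three-state matrix `T`, the observable `a` and the helper name `stationary_distribution` are hypothetical and chosen only for illustration.

```python
import numpy as np

def stationary_distribution(T):
    """Left eigenvector of T with eigenvalue 1, normalized to sum to 1."""
    # Left eigenvectors of T are right eigenvectors of T transposed.
    evals, evecs = np.linalg.eig(T.T)
    idx = np.argmax(evals.real)          # eigenvalue closest to 1 is the largest
    pi = np.abs(evecs[:, idx].real)      # remove an arbitrary overall sign
    return pi / pi.sum()

# Hypothetical three-state transition matrix (each row sums to 1).
T = np.array([[0.90, 0.10, 0.00],
              [0.05, 0.90, 0.05],
              [0.00, 0.10, 0.90]])
a = np.array([1.0, 0.5, 0.0])            # hypothetical observable vector

pi = stationary_distribution(T)
mean_a = a @ pi                          # Eq. 3: <a, pi>
print(pi, mean_a)
```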

Kinetic information is available through time-correlation experiments. These may be realized by taking trajectories from time-resolved single-molecule experiments, such as single-molecule fluorescence or pulling experiments, and computing time correlations from these trajectories. For an ergodic system, the exact time autocorrelation function at lag time $k\tau$ can be computed either as a time average or as an ensemble average: \begin{aligned} \mathbb{E}[A(t)A(t+k\tau)] & =\lim_{t_{max}\rightarrow\infty}\frac{1}{t_{max}}\int_{0}^{t_{max}-k\tau}dt\, A(t)A(t+k\tau)\\ & =\int_{\mathbf{x}\in\Omega}d\mathbf{x}\int_{\mathbf{y}\in\Omega}d\mathbf{y}\, A(\mathbf{x})\,\mu(\mathbf{x})\, p(\mathbf{x},\mathbf{y};\: k\tau)\, A(\mathbf{y})\\ & =\mathbb{E}_{\mu}[A(\mathbf{x}_{t})A(\mathbf{x}_{t+k\tau})]. \:\:\:\:(4)\end{aligned} Given a partition into discrete states, this quantity can be approximated by: \begin{aligned} \mathbb{E}\left[a(t)\: a(t+k\tau)\right] & =\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i}\,\mathbb{P}(x_{t}=i)\,\mathbb{P}(x_{t+k\tau}=j\mid x_{t}=i)\, a_{j}. \:\:\:\:(5)\end{aligned} Each term in the double sum contains the product of the signal in state $i$ and the signal in state $j$, $a_{i}a_{j}$, where $a_{i}$ is weighted by the probability of finding the system in state $i$, and $a_{j}$ is weighted by the conditional probability of finding the system in state $j$ given that it was in state $i$ $k$ time steps of length $\tau$ earlier. In equilibrium, the former probability is given by the equilibrium probability $\pi_{i}$. Assuming that the process is Markovian, the latter probability is given by the corresponding element of the $k$-th power of the transition matrix, $\left[\mathbf{T}^{k}(\tau)\right]_{ij}$. Eq. 5 can thus be rewritten as a matrix equation in which $\mathbf{T}(\tau)$ appears explicitly, \begin{aligned} \mathbb{E}\left[a(t)\: a(t+k\tau)\right] & =\sum_{i=1}^{n}\sum_{j=1}^{n}a_{i}\pi_{i}\left[\mathbf{T}^{k}(\tau)\right]_{ij}a_{j}=\mathbf{a}^{\top}\Pi\mathbf{T}^{k}(\tau)\mathbf{a}, \:\:\:\:(6)\end{aligned} where $\Pi=\mathrm{diag}(\pi_{1},\ldots,\pi_{n})$. Replacing $\mathbf{T}^{k}(\tau)$ by its spectral decomposition, one obtains

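\begin{aligned} \mathbb{E}\left[a(t)\: a(t+k\tau)\right] & =\mathbf{a}^{\top}\left[\sum_{i=1}^{n}\exp\left(-\frac{k\tau}{t_{i}}\right)\mathbf{l}_{i}\mathbf{l}_{i}^{\top}\right]\mathbf{a}\\ & =\langle\mathbf{a},\boldsymbol{\pi}\rangle^{2}+\sum_{i=2}^{n}\exp\left(-\frac{k\tau}{t_{i}}\right)\langle\mathbf{a},\mathbf{l}_{i}\rangle^{2}\:. \:\:\:\:(7)\end{aligned} Here $\mathbf{l}_{i}$ denotes the $i$-th left eigenvector of $\mathbf{T}(\tau)$, normalized such that $\langle\mathbf{l}_{i},\Pi^{-1}\mathbf{l}_{j}\rangle=\delta_{ij}$, so that $\mathbf{l}_{1}=\boldsymbol{\pi}$ and $t_{1}=\infty$. Likewise, cross-correlation functions can be approximated as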

$$\mathbb{E}\left[a(t)\: b(t+k\tau)\right]=\langle\mathbf{a},\boldsymbol{\pi}\rangle\langle\mathbf{b},\boldsymbol{\pi}\rangle+\sum_{i=2}^{n}\exp\left(-\frac{k\tau}{t_{i}}\right)\langle\mathbf{a},\mathbf{l}_{i}\rangle\langle\mathbf{b},\mathbf{l}_{i}\rangle\:. \:\:\:\:(8)$$ Eqs. 7 and 8 have the form of a multiexponential decay function $$f(t)=\gamma_{1}^{\mathrm{corr}}+\sum_{i=2}^{n}\gamma_{i}^{\mathrm{corr}}\exp\left(-\frac{t}{t_{i}}\right)\:, \:\:\:\:(9)$$ with amplitudes $$\gamma_{i}^{\mathrm{corr}}=\langle\mathbf{a},\mathbf{l}_{i}\rangle\langle\mathbf{b},\mathbf{l}_{i}\rangle\:, \:\:\:\:(10)$$ where $\mathbf{b}=\mathbf{a}$ in the case of an autocorrelation. Each amplitude is associated with an eigenvector of the transition matrix, and the decay constant $t_{i}$ is the implied timescale of this eigenvector, $t_{i}=-\tau/\ln\lambda_{i}$, with $\lambda_{i}$ the corresponding eigenvalue.
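The following sketch shows one way the correlation fingerprint of Eqs. 9 and 10 could be evaluated numerically, and checks the spectral form (Eq. 7) against the direct matrix expression (Eq. 6). It assumes a reversible transition matrix with eigenvalues $0<\lambda_{i}<1$ for $i\geq2$, and obtains the normalized left eigenvectors through the symmetrized matrix $\Pi^{1/2}\mathbf{T}\Pi^{-1/2}$, which is a common numerical route but not one prescribed by the text. The function names, the three-state matrix and the observable are hypothetical.

```python
import numpy as np

def left_eigensystem(T, pi):
    """Eigenvalues and normalized left eigenvectors of a reversible T."""
    sqrt_pi = np.sqrt(pi)
    # For a reversible T, S = Pi^(1/2) T Pi^(-1/2) is symmetric.
    S = sqrt_pi[:, None] * T / sqrt_pi[None, :]
    evals, V = np.linalg.eigh(0.5 * (S + S.T))
    order = np.argsort(evals)[::-1]            # largest eigenvalue first
    evals, V = evals[order], V[:, order]
    # Columns l_i = Pi^(1/2) v_i satisfy <l_i, Pi^(-1) l_j> = delta_ij.
    L = sqrt_pi[:, None] * V
    return evals, L

def correlation_fingerprint(T, pi, a, b=None, tau=1.0):
    """Implied timescales t_i and amplitudes gamma_i of Eqs. 9 and 10."""
    b = a if b is None else b
    evals, L = left_eigensystem(T, pi)
    amplitudes = (a @ L) * (b @ L)             # one entry per eigenvector
    timescales = np.full(len(evals), np.inf)   # t_1 is infinite (eigenvalue 1)
    timescales[1:] = -tau / np.log(evals[1:])  # assumes 0 < lambda_i < 1
    return timescales, amplitudes

# Hypothetical reversible three-state model and observable.
T = np.array([[0.90, 0.10, 0.00],
              [0.05, 0.90, 0.05],
              [0.00, 0.10, 0.90]])
pi = np.array([0.25, 0.50, 0.25])              # stationary distribution of T
a = np.array([1.0, 0.5, 0.0])
tau = 1.0

timescales, amplitudes = correlation_fingerprint(T, pi, a, tau=tau)

# Consistency check at lag k*tau: Eq. 6 (direct) versus Eq. 7 (spectral).
k = 5
direct = a @ np.diag(pi) @ np.linalg.matrix_power(T, k) @ a
spectral = np.sum(amplitudes * np.exp(-k * tau / timescales))
print(timescales, amplitudes, direct, spectral)
```

In this toy model the chosen observable has no overlap with the third eigenvector, so the fastest process carries zero amplitude and would be invisible in the corresponding autocorrelation experiment.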

Alternatively, relaxation experiments can be used to probe the molecules’ kinetics. In these experiments, the system is allowed to relax from a nonequilibrium starting state with probability distribution $p_{0}(\mathbf{x})$, and the time-dependent ensemble average of some observable $A$ is measured: $$\mathbb{E}[A(k\tau)]=\int_{\mathbf{y}\in\Omega}d\mathbf{y}\, p_{k\tau}(\mathbf{y})\, A(\mathbf{y})=\int_{\mathbf{y}\in\Omega}d\mathbf{y}\int_{\mathbf{x}\in\Omega}d\mathbf{x}\, p_{0}(\mathbf{x})\, p(\mathbf{x},\mathbf{y};\, k\tau)\, A(\mathbf{y}) \:\:\:\:(11)$$ Examples are temperature-jump, pressure-jump, or pH-jump experiments, rapid mixing experiments, or experiments in which the measurement at $t=0$ starts from a synchronized starting state, such as processes that are started by an external trigger like a photoflash.

In a Markov model, $\mathbb{E}[A(k\tau)]$ can be approximated by using the discrete vector $\mathbf{a}$ and the transition matrix $\mathbf{T}(\tau)$ with stationary distribution $\boldsymbol{\pi}\neq\mathbf{p}_{0}$. The ensemble average $\mathbb{E}_{\boldsymbol{p}_{0}}[a(t)]$ is recorded while the system relaxes from the initial distribution $\mathbf{p}_{0}$ to the new equilibrium distribution $\boldsymbol{\pi}$. The expectation value of the signal at time $t=k\tau$ depends on the current probability distribution $\mathbf{p}_{k\tau}$ and is given by

\begin{aligned} \mathbb{E}_{\boldsymbol{p}_{0}}[a(k\tau)] & =\sum_{i=1}^{n}a_{i}p_{k\tau,i}=\langle\mathbf{a},\mathbf{p}_{k\tau}\rangle. \:\:\:\:(12)\end{aligned} Eq. 12 is analogous to Eq. 3, with the time-dependent distribution $\mathbf{p}_{k\tau}$ in place of $\boldsymbol{\pi}$. The distribution $\mathbf{p}_{k\tau}$ evolves under the influence of the transition matrix, $\mathbf{p}_{k\tau}^{\top}=\mathbf{p}_{0}^{\top}\mathbf{T}^{k}(\tau)$. Using the spectral decomposition of $\mathbf{T}(\tau)$ and expressing $\lambda_{i}^{k}$ via the implied timescales $t_{i}$, we obtain

\begin{aligned} \mathbb{E}_{\boldsymbol{p}_{0}}[a(k\tau)] & =\langle\mathbf{p}'_{0},\boldsymbol{\pi}\rangle\langle\mathbf{a},\boldsymbol{\pi}\rangle+\sum_{i=2}^{n}\exp\left(-\frac{k\tau}{t_{i}}\right)\langle\mathbf{p}'_{0},\mathbf{l}_{i}\rangle\langle\mathbf{a},\mathbf{l}_{i}\rangle\:, \:\:\:\:(13)\end{aligned} where $\mathbf{p}'_{0}$ is the *excess probability distribution* $\mathbf{p}'_{0}=\Pi^{-1}\mathbf{p}_{0}$. Since $\langle\mathbf{p}'_{0},\boldsymbol{\pi}\rangle=\sum_{i}p_{0,i}=1$, the constant term equals the equilibrium expectation $\langle\mathbf{a},\boldsymbol{\pi}\rangle$. $\mathbb{E}_{\boldsymbol{p}_{0}}[a(k\tau)]$ is again a multiexponential decay function with amplitudes

\begin{aligned} \gamma_{i}^{\mathrm{relax}} & =\langle\mathbf{p}'_{0},\mathbf{l}_{i}\rangle\langle\mathbf{a},\mathbf{l}_{i}\rangle\:. \:\:\:\:(14)\end{aligned} A summary of the amplitudes of various types of experiments is given in Table 1.

| Experiment | Amplitude |
|---|---|
| Relaxation experiment | $\gamma_{i}^{\mbox{relax}}=\left\langle \mathbf{a},\mathbf{l}_{i}\right\rangle \left\langle \mathbf{p}'_{0},\mathbf{l}_{i}\right\rangle$ |
| Equilibrium autocorrelation experiment | $\gamma_{i}^{\mbox{eq, auto-cor}}=\left\langle \mathbf{a},\mathbf{l}_{i}\right\rangle ^{2}$ |
| Equilibrium cross-correlation experiment | $\gamma_{i}^{\mbox{eq, cross-cor}}=\left\langle \mathbf{a},\mathbf{l}_{i}\right\rangle \left\langle \mathbf{b},\mathbf{l}_{i}\right\rangle$ |

Table 1: Overview of the expressions for the amplitudes in equilibrium kinetics experiments.

These equations make it possible to predict, based on simulations, which processes a given experiment will be sensitive to. To illustrate this, consider again the protein folding model with three different observables. In observable A, we measure the formation of structure element $a$, i.e. $a=1$ for states in which $a$ is formed and $a=0$ for states in which it is not. Likewise, observables B and C measure the formation of structure elements $b$ and $c$. This can be realized, e.g., with a fluorophore and a specific quencher at appropriate positions [1]. For each of these three constructs we consider three temperature-jump experiments: from $0.15$ to $0.20$, from $0.60$ to $0.65$, and from $2.40$ to $2.45$. We calculate the amplitudes of the slowest and second-slowest processes and report the normalized results in Table 2.
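The relaxation amplitudes of Eq. 14 can be sketched along the same lines. The example below is not the folding model discussed here: the three-state matrix, the observable and the initial distribution `p0` are hypothetical, the matrix is assumed to be reversible, and the normalization by the summed absolute amplitudes of the two slowest processes is only one plausible convention, since the text does not state which one was used.

```python
import numpy as np

def left_eigensystem(T, pi):
    """Eigenvalues and normalized left eigenvectors of a reversible T."""
    sqrt_pi = np.sqrt(pi)
    # For a reversible T, S = Pi^(1/2) T Pi^(-1/2) is symmetric.
    S = sqrt_pi[:, None] * T / sqrt_pi[None, :]
    evals, V = np.linalg.eigh(0.5 * (S + S.T))
    order = np.argsort(evals)[::-1]        # largest eigenvalue first
    evals, V = evals[order], V[:, order]
    # Columns l_i = Pi^(1/2) v_i satisfy <l_i, Pi^(-1) l_j> = delta_ij.
    return evals, sqrt_pi[:, None] * V

# Hypothetical model after the jump, with its stationary distribution.
T = np.array([[0.90, 0.10, 0.00],
              [0.05, 0.90, 0.05],
              [0.00, 0.10, 0.90]])
pi = np.array([0.25, 0.50, 0.25])
a = np.array([1.0, 0.0, 0.0])              # signal is 1 only in the first state
p0 = np.array([0.5, 0.4, 0.1])             # assumed distribution before the jump

evals, L = left_eigensystem(T, pi)
p0_excess = p0 / pi                        # p'_0 = Pi^(-1) p_0
gamma = (p0_excess @ L) * (a @ L)          # Eq. 14; gamma[0] is the constant term

# Consistency check: propagate p0 directly and compare with the spectral sum.
k = 5
direct = p0 @ np.linalg.matrix_power(T, k) @ a
spectral = np.sum(gamma * evals**k)

# Relative weight of the slowest and second-slowest relaxation processes.
normalized = gamma[1:3] / np.abs(gamma[1:3]).sum()
print(gamma, direct, spectral, normalized)
```

Repeating this calculation for several observables and initial distributions gives the kind of comparison that a table of normalized amplitudes summarizes.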

It is apparent that the processes that can be measured depend strongly on how the measurement is done and on the observable used. For example, at high temperatures, all observables yield nearly single-exponential kinetics with the timescale of moving between the unfolded state and the partially structured state. At low temperature, the kinetics may appear biexponential, provided that measurement noise is sufficiently small, with the main amplitude being in the formation of $a$ ($\gamma_{2}$) and $c$ ($\gamma_{3}$).

| Experiment | Amplitude | Obs A | Obs B | Obs C |
|---|---|---|---|---|
| T-Jump $0.15\rightarrow0.20$ | $\gamma_{2}$ | 0.71 | 0.19 | 0.13 |
| | $\gamma_{3}$ | 0.29 | 0.81 | 0.87 |
| T-Jump $0.60\rightarrow0.65$ | $\gamma_{2}$ | 0.94 | 0.89 | 0.17 |
| | $\gamma_{3}$ | 0.06 | 0.11 | 0.83 |
| T-Jump $2.40\rightarrow2.45$ | $\gamma_{2}$ | 0.98 | 0.95 | 0.89 |
| | $\gamma_{3}$ | 0.02 | 0.05 | 0.11 |

Table 2: Normalized amplitudes of the slowest and second-slowest processes of simulated temperature-jump experiments of the folding model.

The combination of Markov models and the spectral theory given above is useful for comparing simulations and experiments via the dynamical fingerprint representation of the system kinetics [3]. Furthermore, this approach makes it possible to design experiments that are optimal for probing individual relaxations [3].

#### Citing Fingerprints:

The idea of dynamical fingerprints and the equations shown above were originally published in [3]. The formalism was extended in [2].

## Bibliography

[1]: S. Doose, H. Neuweiler and M. Sauer: Fluorescence Quenching by Photoinduced Electron Transfer: A Reporter for Conformational Dynamics of Macromolecules. ChemPhysChem 10, 1389-1398 (2009).

[2]: B. Keller, J.-H. Prinz and F. Noé: Markov models and dynamical fingerprints: Unraveling the complexity of molecular kinetics. Chem. Phys. 396, 92-107 (2012).

[3]: F. Noé, S. Doose, I. Daidone, M. Löllmann, J. D. Chodera, M. Sauer and J. C. Smith: Dynamical fingerprints for probing individual relaxation processes in biomolecular dynamics with simulations and kinetic experiments. Proc. Natl. Acad. Sci. USA 108, 4822-4827 (2011).

[4]: J.-H. Prinz, B. Keller and F. Noé: Probing molecular kinetics with Markov models: Metastable states, transition pathways and spectroscopic observables. Phys. Chem. Chem. Phys. 13, 16912-16927 (2011).