Mitochondrial network state scales mtDNA genetic dynamics

Mitochondrial DNA (mtDNA) mutations cause severe congenital diseases but may also be associated with healthy aging. MtDNA is stochastically replicated and degraded, and exists within organelles which undergo dynamic fusion and fission. The role of the resulting mitochondrial networks in the time evolution of the cellular proportion of mutated mtDNA molecules (heteroplasmy), and cell-to-cell variability in heteroplasmy (heteroplasmy variance), remains incompletely understood. Heteroplasmy variance is particularly important since it modulates the number of pathological cells in a tissue. Here, we provide the first wide-reaching theoretical framework which bridges mitochondrial network and genetic states. We show that, under a range of conditions, the (genetic) rate of increase in heteroplasmy variance and de novo mutation are proportionally modulated by the (physical) fraction of unfused mitochondria, independently of the absolute fission-fusion rate. In the context of selective fusion, we show that intermediate fusion/fission ratios are optimal for the clearance of mtDNA mutants. Our findings imply that modulating network state, mitophagy rate and copy number to slow down heteroplasmy dynamics when mean heteroplasmy is low could have therapeutic advantages for mitochondrial disease and healthy aging.


Introduction
Mitochondrial DNA (mtDNA) encodes elements of the respiratory system vital for cellular function. Mutation of mtDNA is one of several leading hypotheses for the cause of normal aging (Kauppila et al., 2017;López-Otín et al., 2013), as well as underlying a number of heritable mtDNA-related diseases (Schon et al., 2012). Cells typically contain hundreds, or thousands, of copies of mtDNA per cell: each molecule encodes crucial components of the electron transport chain, which generates energy for the cell in the form of ATP. Consequently, the mitochondrial phenotype of a single cell is determined, in part, by its fluctuating population of mtDNA molecules (Aryaman et al., 2019;Johnston, 2018;Stewart and Chinnery, 2015;Wallace and Chalkia, 2013). The broad biomedical implications of mitochondrial DNA mutation, combined with the countable nature of mtDNAs and the stochastic nature of their dynamics, offer the opportunity for mathematical understanding to provide important insights into human health and disease (Aryaman et al., 2019).
An important observation in mitochondrial physiology is the threshold effect, whereby cells may often tolerate relatively high levels of mtDNA mutation, until the fraction of mutated mtDNAs (termed heteroplasmy) exceeds a certain critical value where a pathological phenotype occurs (Aryaman et al., 2017;Picard et al., 2014;Rossignol et al., 2003;Stewart and Chinnery, 2015). Fluctuations within individual cells mean that the fraction of mutant mtDNAs per cell is not constant within a tissue ( Figure 1A), but follows a probability distribution which changes with time ( Figure 1B). Here, motivated by a general picture of aging, we will largely focus on the setting of non-dividing cells, which possess two mtDNA variants (although we will also consider de novo mutation using simple statistical genetics models). The variance of the distribution of heteroplasmies gives the fraction of cells above a given pathological threshold ( Figure 1B). Therefore heteroplasmy variance is related to the number of dysfunctional cells above a phenotypic threshold within a tissue, and both heteroplasmy mean and variance are directly related to tissue physiology. Increases in heteroplasmy variance also increase the number of cells below a given threshold heteroplasmy, which can be advantageous in e.g. selecting low-heteroplasmy embryos in pre-implantation genetic diagnosis for treating mitochondrial disease (Burgstaller et al., 2014b;Johnston et al., 2015).
Mitochondria exist within a network which dynamically fuses and fragments. Although the function of mitochondrial networks remains an open question (Hoitzing et al., 2015), it is often thought that a combination of network dynamics and mitochondrial autophagy (termed mitophagy) act in concert to perform quality control on the mitochondrial population (Aryaman et al., 2019;Johnston, 2018;Twig et al., 2008). Observations of pervasive intra-mitochondrial mtDNA mutation (Morris et al., 2017) and universal heteroplasmy in humans (Payne et al., 2012) suggest that the power of this quality control may be limited. It has also been suggested that certain mtDNA mutations, such as deletions (Kowald and Kirkwood, 2018) and some point mutations (Li et al., 2015;Lieber et al., 2019;Samuels et al., 2013;Ye et al., 2014), are under the influence of selective effects. However, genetic models without selection have proven valuable in explaining the heteroplasmy dynamics both of functional mutations (Elson et al., 2001;Taylor et al., 2003;Wonnapinij et al., 2008) and polymorphisms without dramatic functional consequences (Birky et al., 1983;Ye et al., 2014), and in common cases where mean heteroplasmy shifts are small compared to changes in variances (for instance, in germline development (Johnston et al., 2015) and post-mitotic tissues (Burgstaller et al., 2014a)). Mean changes seem more likely in high-turnover tissues and when mtDNA variants are genetically distant (Burgstaller et al., 2014a;Pan et al., 2019), suggesting that neutral genetic theory may be useful in understanding the dynamics of the set of functionally mild mutations which accumulate during aging. Neutral genetic theory also provides a valuable null model for understanding mitochondrial genetic dynamics (Chinnery and Samuels, 1999;Johnston and Jones, 2016;Poovathingal et al., 2009), potentially allowing us to better understand and quantify when selection is present.
There is thus a set of open questions about how the physical dynamics of mitochondria affect the genetic populations of mtDNA within and between cells under neutral dynamics.
A number of studies have attempted to understand the impact of the mitochondrial network on mitochondrial dysfunction through computer simulation (reviewed in Kowald and Klipp (2014)). These studies have suggested: that clearance of damaged mtDNA can be assisted by high and functionally-selective mitochondrial fusion, or by intermediate fusion and selective mitophagy (Mouli et al., 2009); that physical transport of mitochondria can indirectly modulate mitochondrial health through mitochondrial dynamics (Patel et al., 2013); that fission-fusion dynamic rates modulate a trade-off between mutant proliferation and removal (Tam et al., 2013, 2015); and that if fission is damaging, decelerating fission-fusion cycles may improve mitochondrial quality (Figge et al., 2012).
Although these previous attempts to link mitochondrial genetics and network dynamics provided valuable insights and broke important ground, they have centered around complex computer simulations, making it difficult to deduce general laws and principles. Here, we address this lack of a general theoretical framework linking mitochondrial dynamics and genetics. We take a simpler approach in terms of our model structure ( Figure 1C), allowing us to derive explicit, interpretable, mathematical formulae which provide intuitive understanding, and give a direct account of the phenomena which are observed in our model ( Figure 1D). Our results hold for a range of variant model structures. Simplified approaches using stochastic modelling have shown success in understanding mitochondrial physiology from a purely genetic perspective (Capps et al., 2003;Chinnery and Samuels, 1999;Johnston and Jones, 2016). Furthermore, there currently exists limited evidence for pronounced, universal, selective differences of mitochondrial variants in vivo (Hoitzing, 2017;Stewart and Larsson, 2014). Our basic approach therefore also differs from previous modelling attempts, since our model is neutral with respect to genetics (no replicative advantage or selective mitophagy) and the mitochondrial network (no selective fusion). Evidence for negative selection of particular mtDNA mutations has been observed in vivo (Morris et al., 2017;Ye et al., 2014); we therefore extend our analysis to explore selectivity in the context of mitochondrial quality control using our simplified framework.
Here, we reveal the first general mathematical principle linking (physical) network state and (genetic) heteroplasmy statistics ( Figure 1D). Our models potentially allow rich interactions between mitochondrial genetic and network dynamics, yet we find that a simple link emerges. For a broad range of situations, the expansion of mtDNA mutants is strongly modulated by network state, such that the rate of increase of heteroplasmy variance, and the rate of accumulation of de novo mutation, are proportional to the fraction of unfused mitochondria. We discover that this result stems from the general notion that fusion shields mtDNAs from turnover, since autophagy of large fragments of the mitochondrial network is unlikely, and consequently rescales time. Importantly, we used our model for network dynamics to show that heteroplasmy variance is independent of the absolute magnitude of the fusion and fission rates due to a separation of timescales between genetic and network processes (in contrast to Tam et al. (2015)). Surprisingly, we find the dependence of heteroplasmy statistics upon network state arises when the mitochondrial population size is controlled through replication, and vanishes when it is controlled through mitophagy, shedding new light on the physiological importance of the mode of mtDNA control. We show that when fusion is selective, intermediate fusion/fission ratios are optimal for the clearance of mutated mtDNAs (in contrast to Mouli et al. (2009)). When mitophagy is selective, complete fragmentation of the network results in the most effective elimination of mitochondrial mutants (in contrast to Mouli et al. (2009)). We also confirm that mitophagy and mitochondrial DNA copy number affect the rate of accumulation of de novo mutations (Johnston and Jones, 2016); see Table S1 for a summary of our key findings. We suggest that pharmacological interventions which promote fusion, slow mitophagy and increase copy number earlier in development may slow the rate of accumulation of pathologically mutated cells, with implications for mitochondrial disease and aging.
Figure 1. A simple model bridging mitochondrial networks and genetics yields a wide-reaching, analytically obtained, description of heteroplasmy variance dynamics. (A) A population of cells from a tissue exhibit inter-cellular heterogeneity in mitochondrial content: both mutant load (heteroplasmy) and copy number. (B) Inter-cellular heterogeneity implies that heteroplasmy is described by a probability distribution. Cells above a threshold heteroplasmy (h*, black dashed line) are thought to exhibit a pathological phenotype. The low-variance distribution (black line) has fewer cells above a pathological threshold heteroplasmy than the high-variance distribution (red line). Heteroplasmy is depicted as an approximately normal distribution, as this is the regime in which our approximations below hold: i.e. when the probability of fixation is small. (C) The chemical reaction network we use to model the dynamics of mitochondrial DNA (see Main Text for a detailed description). MtDNAs are assigned a genetic state: mutant (M) or wild-type (W), and a network state: singleton (i.e. unfused, S) or fused (F). (D) The central result of our work is, assuming that a cell at time t = 0 is at its (deterministic) steady-state, heteroplasmy variance (V(h)) approximately increases with time (t), mitophagy rate (µ) and the fraction of mitochondria that are unfused (fs), and decreases with mtDNA copy number (n). Importantly, V(h) does not depend on the absolute magnitude of the fission-fusion rates. Also see Table S1 for a summary of our key findings.

Materials and Methods
Stochastic modelling of the coupling between genetic and network dynamics of mtDNA populations
Our modelling approach takes a chemical master equation perspective by combining a general model of neutral genetic drift (for instance, see Chinnery and Samuels (1999); Johnston and Jones (2016)) with a model of mitochondrial network dynamics. We seek to understand the influence of the mitochondrial network upon mitochondrial genetics. The network state itself is influenced by several factors including metabolic poise and the respiratory state of mitochondria (Hoitzing et al., 2015;Mishra and Chan, 2016;Szabadkai et al., 2006), which we do not consider explicitly here. We consider the existence of two mitochondrial alleles, wild-type (W) and mutant (M), existing within a post-mitotic cell without cell division, with mtDNAs undergoing turnover (or "relaxed replication"; Stewart and Chinnery, 2015). MtDNAs exist within mitochondria, which undergo fusion and fission. We therefore assign mtDNAs a network state: fused (F) or unfused (which we term "singleton", S). This representation of the mitochondrial network allows us to include the effects of the mitochondrial network in a simple way, without the need to resort to a spatial model or consider the precise network structure, allowing us to make analytic progress and derive interpretable formulae in a more general range of situations.
Our model can be decomposed into three notional blocks ( Figure 1C). Firstly, the principal network processes denote fusion and fission of mitochondria containing mtDNAs of the same allele (Eqs. (1)-(3)), where X denotes either a wild-type (W) or a mutant (M) mtDNA (therefore a set of chemical reactions analogous to Eq. (1)-(3) exist for both DNA species). γ and β are the stochastic rate constants for fusion and fission respectively. Secondly, mtDNAs are replicated and degraded through a set of reactions termed genetic processes. A central assumption is that all degradation of mtDNAs occurs through mitophagy, and that only small pieces of the mitochondrial network are susceptible to mitophagy; for parsimony we take the limit of only the singletons being susceptible to mitophagy (Eqs. (4)-(6)), where λ and µ are the replication and mitophagy rates respectively, which are shared by both W and M resulting in a so-called 'neutral' genetic model. Eq. (6) denotes removal of the species from the system. The effect of allowing non-zero degradation of fused species is discussed in Supporting Information (see Eq. (S68) and Figure S3E). Replication of a singleton changes the network state of the mtDNA into a fused species, since replication occurs within the same membrane-bound organelle. An alternative model of singletons which replicate into singletons, thereby associating mitochondrial replication with fission (Lewis et al., 2016), leaves our central result ( Figure 1D) unchanged (see Supporting Information, Eq. (S67)). The system may be considered neutral since both W and M possess the same replication and degradation rates per molecule of mtDNA at any instant in time.
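For orientation, the single-allele network and genetic processes described above can be written out explicitly. The following is a sketch reconstructed from the prose rather than a verbatim copy of Eqs. (1)-(6): the stoichiometry of the two fusion reactions (each fusion event converting the participating singleton(s) into fused species) and the ordering of the reactions are assumptions. X stands for either W or M, and subscripts S and F denote singleton and fused network states.

```latex
\begin{align*}
X_S + X_S &\xrightarrow{\;\gamma\;} X_F + X_F  && \text{(fusion of two singletons)} \\
X_S + X_F &\xrightarrow{\;\gamma\;} X_F + X_F  && \text{(fusion of a singleton with a fused species)} \\
X_F       &\xrightarrow{\;\beta\;}  X_S        && \text{(fission)} \\
X_S       &\xrightarrow{\;\lambda\;} X_F + X_F && \text{(replication of a singleton)} \\
X_F       &\xrightarrow{\;\lambda\;} X_F + X_F && \text{(replication of a fused species)} \\
X_S       &\xrightarrow{\;\mu\;}    \emptyset  && \text{(mitophagy, singletons only)}
\end{align*}
```

Written this way, six reactions per allele give twelve reactions for the two alleles, and mitophagy is the only reaction that removes mtDNA from the system.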
Finally, mtDNAs of different genotypes may interact through fusion via a set of reactions we term network cross-processes (Eqs. (7)-(9)). Any fusion or fission event which does not involve the generation or removal of a singleton leaves our system unchanged; we term such events non-identity-changing processes, which can be ignored in our system (see Supporting Information, Rate renormalization). We have neglected de novo mutation in the model description above (although we will consider de novo mutation using a modified infinite sites Moran model below). We found that treating λ = const led to instability in total copy number (see Supporting Information, Constant rates yield unstable copy numbers for a model describing mtDNA genetic and network dynamics), which is not credible. We therefore favoured a state-dependent replication rate such that copy number is controlled to a particular value, as has been done by previous authors (Capps et al., 2003;Chinnery and Samuels, 1999;Johnston and Jones, 2016). Allowing lower-case variables to denote the copy number of their respective molecular species, we will focus on a linear replication rate of the form given in Eq. (10) (Hoitzing, 2017;Hoitzing et al., 2017), where w T = w s + w f is the total wild-type copy number, and similarly for m T . The lower-case variables w s , w f , m s , and m f denote the copy numbers of the corresponding chemical species (W S , W F , M S , and M F ). b is a parameter which determines the strength with which total copy number is controlled to a target copy number, and κ is a parameter which is indicative of (but not equivalent to) the steady state copy number. δ indicates the relative contribution of mutant mtDNAs to the control strength and is linked to the "maintenance of wild-type" hypothesis (Durham et al., 2007;Stewart and Chinnery, 2015). When 0 ≤ δ < 1, and both mutant and wild-type species are present, mutants have a lower contribution to the birth rate than wild-types.
When wild-types are absent, the population size will be larger than when there are no mutants: hence mutants have a higher carrying capacity in this regime. We have modelled the mitophagy rate as constant per mtDNA. We do, however, explore relaxing this constraint below by allowing mitophagy to be a function of state, and also affect mutants differentially under quality control. λ may be re-written as λ = k 1 + k 2 w T + k 3 m T for constants k i , and so only consists of 3 independent parameters. However we will retain λ in the form of Eq. (10) since the parameters µ, b, κ, and δ have the distinct physiological meanings described above (Hoitzing, 2017;Hoitzing et al., 2017). Furthermore, λ may in general also depend on other cellular features such as mitochondrial reactive oxygen species. Here, we seek to explain mitochondrial behaviour under a simple set of governing principles, but our approach can naturally be combined with a description of these additional factors to build a more comprehensive model. Analogues of this model (without a network) have been applied to mitochondrial systems (Capps et al., 2003;Chinnery and Samuels, 1999). Overall, our simple model consists of 4 species (W S , W F , M S , M F ), 6 independent parameters and 15 reactions, and captures the central property that mitochondria fragment before degradation (Twig et al., 2008). Throughout this work, we define heteroplasmy as the mutant allele fraction per cell of a mitochondrially-encoded variant, h(x), given in Eq. (11) (Aryaman et al., 2019;Samuels et al., 2010;Wonnapinij et al., 2008), where x = (w s , w f , m s , m f ) is the state of the system (not to be confused with mitochondrial "respiratory states"). Hence, a heteroplasmy of h = 1 denotes a cell with 100% mutant mtDNA (i.e. a homoplasmic cell in the mutant allele). Arguably, "mutant allele fraction" would be a more precise description of Eq. (11) but we retain the use of heteroplasmy for consistency.
To convert to a definition of heteroplasmy which is maximal when the mutant allele fraction is 50%, one may simply use the conversion 0.5 − |h(x) − 0.5|.
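As a minimal illustration of the two definitions above, the mutant allele fraction and its folded counterpart can be computed directly from the state x = (w s , w f , m s , m f ). The function names and the handling of an empty population are our own illustrative choices, not part of the published code.

```python
def heteroplasmy(w_s, w_f, m_s, m_f):
    """Mutant allele fraction h(x) for the state x = (w_s, w_f, m_s, m_f)."""
    total = w_s + w_f + m_s + m_f
    if total == 0:
        # h is undefined when the cell contains no mtDNA at all
        raise ValueError("heteroplasmy is undefined for an empty mtDNA population")
    return (m_s + m_f) / total


def folded_heteroplasmy(h):
    """Convert h to the definition that is maximal at a 50% mutant allele fraction."""
    return 0.5 - abs(h - 0.5)
```

Under the folded definition, cells with 10% and 90% mutant mtDNA are assigned the same value, whereas h itself distinguishes them.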

Statistical Analysis
In Figures S3B, S4A-I, we compare Eq. (13) and Eq. (S72) to stochastic simulations, for various parametrizations and replication/degradation rates. To quantify the accuracy of these equations in predicting V(h, t), we define an error metric (Eq. (12)), where dV(h, t)/dt is the time derivative of heteroplasmy variance, with subscripts denoting theory (Th) and simulation (Sim). An expectation over time (E t ) is taken for the stochastic simulations, whereas dV(h, t)/dt is a scalar quantity for Eq. (13) and Eq. (S72).

Data Availability
Code for simulations and analysis can be accessed at https://GitHub.com/ImperialCollegeLondon/MitoNetworksGenetics

Results
Mitochondrial network state rescales the linear increase of heteroplasmy variance over time, independently of fission-fusion rate magnitudes
We first performed a deterministic analysis of the system presented in Eqs. (1)-(10), by converting the reactions into an analogous set of four coupled ordinary differential equations (see Eqs. (S29)-(S32)), and choosing a biologically-motivated approximate parametrization (which we will term the 'nominal' parametrization, see Supporting Information, Choice of nominal parametrization, and Table S2). Figures 2A-B show that copy numbers of each individual species change in time such that the state approaches a line of steady states (Eqs. (S34)-(S36)), as seen in other neutral genetic models (Capps et al., 2003;Hoitzing, 2017). Upon reaching this line, total copy number remains constant ( Figure S2A) and the state of the system ceases to change with time. This is a consequence of performing a deterministic analysis, which neglects stochastic effects, and our choice of replication rate in Eq. (10) which decreases with total copy number when w T + δm T > κ and vice versa, guiding the total population to a fixed total copy number. Varying the fission (β) and fusion (γ) rates revealed a negative linear relationship between the steady-state fraction of singletons and copy number ( Figure S2B). We may also simulate the system in Eqs. (1)-(9) stochastically, using the stochastic simulation algorithm (Gillespie, 1976), which showed that mean copy number is slightly perturbed from the deterministic prediction due to the influence of variance upon the mean (Grima et al., 2011; Hoitzing, 2017) ( Figure 2C). The stationarity of total copy number is a consequence of using δ = 1 for our nominal parametrization (i.e. the line of steady states is also a line of constant copy number). Choosing δ ≠ 1 results in a difference in carrying capacities between the two species, and non-stationarity of mean total copy number, as trajectories spread along the line of steady states to different total copy numbers. Copy number variance initially increases since trajectories are all initialised at the same state, but plateaus because trajectories are constrained in their copy number to remain near the attracting line of steady states ( Figure S3A). Mean heteroplasmy remains constant through time under this model ( Figure 2D, see (Birky et al., 1983)). This is unsurprising since each species possesses the same replication and degradation rate, so neither species is preferred.
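To make the stochastic treatment concrete, the following is a minimal sketch of the direct-method stochastic simulation algorithm applied to a four-species system of this kind. It is not the code from the repository cited under Data Availability. Several ingredients are assumptions made for illustration: the exact stoichiometry of the fusion and cross-fusion reactions, the mass-action form of the propensities, the linear replication rate λ = µ + b(κ − w T − δ m T ) (chosen to be consistent with the parameters µ, b, κ and δ described in Materials and Methods), the clipping of λ at zero, and all function and parameter names.

```python
import random


def replication_rate(state, p):
    """Assumed linear feedback: lambda = mu + b*(kappa - w_T - delta*m_T), clipped at 0."""
    w_s, w_f, m_s, m_f = state
    lam = p["mu"] + p["b"] * (p["kappa"] - (w_s + w_f) - p["delta"] * (m_s + m_f))
    return max(lam, 0.0)


def model_reactions(state, p):
    """Return (propensity, state change) pairs with positive propensity.

    The state is ordered (w_s, w_f, m_s, m_f).
    """
    w_s, w_f, m_s, m_f = state
    lam = replication_rate(state, p)
    g, beta, mu = p["gamma"], p["beta"], p["mu"]
    rx = []
    for s, f in ((0, 1), (2, 3)):  # wild-type block, then mutant block
        x_s, x_f = state[s], state[f]

        def change(ds, df, s=s, f=f):
            c = [0, 0, 0, 0]
            c[s] += ds
            c[f] += df
            return tuple(c)

        rx += [
            (g * x_s * (x_s - 1) / 2, change(-2, 2)),  # X_S + X_S -> 2 X_F
            (g * x_s * x_f, change(-1, 1)),            # X_S + X_F -> 2 X_F
            (beta * x_f, change(1, -1)),               # X_F -> X_S (fission)
            (lam * x_s, change(-1, 2)),                # X_S -> 2 X_F (replication)
            (lam * x_f, change(0, 1)),                 # X_F -> 2 X_F (replication)
            (mu * x_s, change(-1, 0)),                 # X_S -> 0 (mitophagy)
        ]
    rx += [  # cross-processes: fusion between different genotypes
        (g * w_s * m_s, (-1, 1, -1, 1)),               # W_S + M_S -> W_F + M_F
        (g * w_f * m_s, (0, 0, -1, 1)),                # W_F + M_S -> W_F + M_F
        (g * w_s * m_f, (-1, 1, 0, 0)),                # W_S + M_F -> W_F + M_F
    ]
    return [(rate, c) for rate, c in rx if rate > 0]


def simulate(state, p, t_end, rng):
    """Direct-method stochastic simulation up to time t_end; returns the final state."""
    state = list(state)
    t = 0.0
    while True:
        rx = model_reactions(state, p)
        total = sum(rate for rate, _ in rx)
        if total <= 0.0:
            break  # nothing can happen (e.g. the population is extinct)
        t += rng.expovariate(total)
        if t > t_end:
            break
        threshold = rng.random() * total
        running = 0.0
        for rate, change in rx:
            running += rate
            if threshold < running:
                break
        # 'change' is the selected reaction (the last one on floating-point fallthrough)
        for i, d in enumerate(change):
            state[i] += d
    return tuple(state)
```

Repeating `simulate` over many independent cells initialised at the same state, and recording heteroplasmy at a series of time points, would give the ensemble mean and variance of heteroplasmy that the deterministic analysis cannot provide. With network rates much faster than mitophagy, most simulated events are fusion or fission, so run time grows with the ratio β/µ.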
From stochastic simulations we observed that, for sufficiently short times, heteroplasmy variance increases approximately linearly through time for a range of parametrizations ( Figure 2E-H), which is in agreement with recent single-cell oocyte measurements in mice (Burgstaller et al., 2018). Previous work has also shown a linear increase in heteroplasmy variance through time for purely genetic models of mtDNA dynamics (see Johnston and Jones (2016)). We sought to understand the influence of mitochondrial network dynamics upon the rate of increase of heteroplasmy variance.
Figure 2. Wild-type and mutant copy numbers (A) and fused and unfused copy numbers (B) both move towards a line of steady states under a deterministic model, as indicated by arrows. In stochastic simulation, mean copy number (C) is initially slightly perturbed from the deterministic treatment of the system, and then remains constant, while mean heteroplasmy (D) remains invariant with time (see Eq. (S61)). In (E)-(H), we show that Eq. (13) holds across many cellular circumstances: lines give analytic results, points are from stochastic simulation. Heteroplasmy variance behaviour is successfully predicted for varying mitophagy rate (E), steady state copy number (F), mutation sensing (G), and fusion rate (H). In (H), fusion and fission rates are redefined as γ → γ0 M R and β → β0 M where M and R denote the relative magnitude and ratio of the network rates, and γ0, β0 denote the nominal parametrizations of the fusion and fission rates respectively (see Table S2). Figure S3D shows a sweep of M over the same logarithmic range when R = 1. See Figure S4A-I and Table S3 for parameter sweeps numerically demonstrating the generality of the result for different mtDNA control modes.
To this end, we analytically explored the influence of mitochondrial dynamics on mtDNA variability. Assuming that the state of the system above is initialised at its deterministic steady state (x(t = 0) = x ss ), we took the limit of large mtDNA copy numbers and fast fission-fusion dynamics, and applied a second-order truncation of the Kramers-Moyal expansion (Gardiner, 1985) to the chemical master equation describing the dynamics of the system (see Supporting Information). This yielded a stochastic differential equation for heteroplasmy, via Itô's formula (Jacobs, 2010). Upon forcing the state variables onto the steady-state line (Constable et al., 2016), we derived Eq. (S63), which may be approximated for sufficiently short times by Eq. (13). Here, V(h) is the variance of heteroplasmy, µ is the mitophagy rate, n(x) is the total copy number and f s (x) is the fraction of unfused (singleton) mtDNAs, and is thus a measure of the fragmentation of the mitochondrial network. x ss is the (deterministic) steady state of the system. Eq. (13) demonstrates that mtDNA heteroplasmy variance increases approximately linearly with time (t) at a rate scaled by the fraction of unfused mitochondria, mitophagy rate, and inverse population size. We find that Eq. (13) closely matches heteroplasmy variance dynamics from stochastic simulation, for sufficiently short times after initialisation, for a variety of parametrizations of the system ( Figure 2E-H, Figure S5). To our knowledge, Eq. (13) reflects the first analytical principle linking mitochondrial dynamics and the cellular population genetics of mtDNA variance. Its simple form allows several intuitive interpretations. As time progresses, replication and degradation of both species occur, allowing the ratio of species to fluctuate; hence we expect V(h) to increase with time according to random genetic drift ( Figure 2E-H). The rate of occurrence of replication/degradation events is set by the mitophagy rate µ, since degradation events are balanced by replication rates to maintain population size; hence, random genetic drift occurs more quickly if there is a larger turnover in the population ( Figure 2E). We expect V(h) to increase more slowly in large population sizes, since the birth of e.g. 1 mutant in a large population induces a small change in heteroplasmy ( Figure 2F).
The factor of h(1 − h) encodes the state-dependence of heteroplasmy variance, exemplified by the observation that if a cell is initialised at h = 0 or h = 1, heteroplasmy must remain at its initial value (since the model above does not consider de novo mutation, see below) and so heteroplasmy variance is zero. Furthermore, the rate of increase of heteroplasmy variance is maximal when a cell's initial value of heteroplasmy is 1/2. In Figure 2G, we show that Eq. (13) is able to recapitulate the rate of heteroplasmy variance increase across different values of δ, which are hypothesized to correspond to different replicative sensing strengths of different mitochondrial mutations (Hoitzing, 2017). We also show in Figures
In Eq. (6), we have made the important assumption that only unfused mitochondria can be degraded via mitophagy, as seen by Twig et al. (2008), hence the total propensity of mtDNA turnover is limited by the number of mtDNAs which are actually susceptible to mitophagy. Strikingly, we find that the dynamics of heteroplasmy variance are independent of the absolute rate of fusion and fission, only depending on the fraction of unfused mtDNAs at any particular point in time (see Figure 2H and Figure S3D). This finding, which contrasts with the model of Tam et al. (2013, 2015) (see Discussion), arises from the observation that mitochondrial network dynamics are much faster than replication and degradation of mtDNA, by around a factor of β/µ ≈ 10^3 (see Table S2), resulting in the existence of a separation of timescales between network and genetic processes. In the derivation of Eq. (13), we have assumed that fission-fusion rates are infinite, which simplifies V(h) into a form which is independent of the magnitude of the fission-fusion rate. A parameter sweep of the magnitude and ratio of the fission-fusion rates reveals that, if the fusion and fission rates are sufficiently small, Eq. (13) breaks down and V(h) gains dependence upon the magnitude of these rates (see Figure S4A). This regime, however, corresponds to network rates which are approximately 100 times smaller than the biologically-motivated nominal parametrization shown in Figure 2A-D, such that the fission-fusion rate becomes comparable to the mitophagy rate. Since fission-fusion takes place on a faster timescale than mtDNA turnover, we may neglect this region of parameter space as being implausible.
Eq. (13) can be viewed as describing the "quasi-stationary state" where the probability of extinction of either allele is negligible (Johnston and Jones, 2016). On longer timescales, or if mtDNA half-life is short (Poovathingal et al., 2012), the probability of fixation becomes appreciable. In this case, Eq. (13) over-estimates V(h) as heteroplasmy variance gradually becomes sub-linear with time, see Figure S5C&D. This is evident through inspection of Eq. (S63), which shows that cellular trajectories which reach h = 0 or h = 1 cease to diffuse in heteroplasmy space, and so heteroplasmy variance cannot increase indefinitely. Consequently, the depiction of heteroplasmy variance in Fig. 1B,D as being approximately normally distributed corresponds to the regime in which our approximation holds, and is a valid subset of the behaviours displayed by heteroplasmy dynamics under more sophisticated models (e.g. the Kimura distribution; Kimura, 1955; Wonnapinij et al., 2008). Further analytical developments may be possible to take into account extinction (e.g. see Assaf and Meerson (2010); Wonnapinij et al. (2008)). However, the linear regime for heteroplasmy variance has been observed to be a substantial component of mtDNA dynamics in e.g. mouse oocytes (Burgstaller et al., 2018).
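The scaling described for Eq. (13) can be summarised in a few lines of code. The sketch below assumes the short-time form V(h) ≈ 2 µ f s t h(1 − h)/n, with all quantities evaluated at the deterministic steady state. The dependence on µ, f s , t, n and h(1 − h) follows the description in the text; the numerical prefactor of 2 is an assumption taken from the standard result for neutral drift in a birth-death population and should be checked against Eq. (13) itself. The function name and argument order are illustrative.

```python
def heteroplasmy_variance_linear(t, mu, f_s, n, h):
    """Short-time, quasi-stationary approximation to heteroplasmy variance.

    t   : time since initialisation at the deterministic steady state
    mu  : mitophagy rate per mtDNA (same time units as t)
    f_s : fraction of unfused (singleton) mtDNAs, between 0 and 1
    n   : total mtDNA copy number per cell
    h   : initial heteroplasmy, between 0 and 1
    """
    # The prefactor 2 is assumed, see the accompanying text
    return 2.0 * mu * f_s * t * h * (1.0 - h) / n
```

In this form, halving the fraction of unfused mtDNAs has the same effect on V(h) as halving the elapsed time, which is the sense in which network state rescales time. The approximation is only meaningful while fixation at h = 0 or h = 1 remains improbable, as discussed above.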

The influence of mitochondrial dynamics upon heteroplasmy variance under different models of genetic mtDNA control
To demonstrate the generality of this result, we explored several alternative forms of cellular mtDNA control (Johnston and Jones, 2016). We found that when copy number is controlled through the replication rate function (i.e. λ = λ(x), µ = const), and when the fusion and fission rates were high and the fixation probability (P(h = 0) or P(h = 1)) was negligible, Eq. (13) accurately described V(h) across all of the replication rates investigated, see Figure S4A-F. The same mathematical argument used to show Eq. (13) for the replication rate in Eq. (10) may be applied to these alternative replication rates where a closed-form solution for the deterministic steady state may be written down (see Supporting Information, Deriving an ODE description of the mitochondrial network system). Interestingly, when copy number is controlled through the degradation rate (i.e. λ = const, µ = µ(x)), heteroplasmy variance loses its dependence upon network state entirely and the f s term is lost from Eq. (13) (see Eq. (S72) and Figure S4G-I). A similar mathematical argument was applied to reveal how this dependence is lost (see Supporting Information, Proof of heteroplasmy relation for linear feedback control).
In order to provide an intuitive account for why control in the replication rate, versus control in the degradation rate, determines whether or not heteroplasmy variance has network dependence, we investigated a time-rescaled form of the Moran process (see Supporting Information, A modified Moran process may account for the alternative forms of heteroplasmy variance dynamics under different models of genetic mtDNA control). The Moran process is structurally much simpler than the model presented above, to the point of being unrealistic, in that the mitochondrial population size is constrained to be constant between consecutive time steps. Despite this, the modified Moran process proved to be insightful. We find that, when copy number is controlled through the replication rate, the absence of death in the fused subpopulation means the timescale of the system (being the time to the next death event) is proportional to f s . In contrast, when copy number is controlled through the degradation rate, the presence of a constant birth rate in the entire population means the timescale of the system (being the time to the next birth event) is independent of f s (see Eq. (S84) and surrounding discussion).

Control strategies against mutant expansions
In this study, we have argued that the rate of increase of heteroplasmy variance, and therefore the rate of accumulation of pathologically mutated cells within a tissue, increases with mitophagy rate (µ), decreases with total mtDNA copy number per cell (n) and increases with the fraction of unfused mitochondria (termed "singletons", f_s); see Eq. (13). Below, we explore how biological modulation of these variables influences the accumulation of mutations. We use this new insight to propose three classes of strategy to control mutation accumulation and hence address associated issues in aging and disease, and discuss these strategies through the lens of existing biological literature.
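The dependence on µ, n and f_s described above can be illustrated with a minimal Python sketch. It assumes that Eq. (13) takes the Moran-like form V(h) ≈ 2 µ f_s h(1 − h) t / n; this form, the function name and the parameter values used below are assumptions made for illustration, not quantities taken from the text.

```python
def heteroplasmy_variance_rate(mu, n, f_s, h):
    """Rate of increase of heteroplasmy variance, V(h)/t.

    ASSUMPTION: Eq. (13) is taken to have the form
        V(h) ~ 2 * mu * f_s * h * (1 - h) * t / n,
    so the returned rate is linear in mu and f_s and inversely
    proportional to n.
    """
    if n <= 0:
        raise ValueError("copy number n must be positive")
    if not 0.0 <= f_s <= 1.0:
        raise ValueError("f_s is a fraction and must lie in [0, 1]")
    return 2.0 * mu * f_s * h * (1.0 - h) / n


if __name__ == "__main__":
    # Hypothetical values: mu in day^-1, n in molecules, h is mean heteroplasmy.
    mu, n, h = 0.02, 1000, 0.3
    for f_s in (1.0, 0.5, 0.1):
        rate = heteroplasmy_variance_rate(mu, n, f_s, h)
        print(f"f_s = {f_s:.1f}: V(h)/t = {rate:.3e} per day")
```

Under this assumed form, halving f_s or µ, or doubling n, each halves the rate at which heteroplasmy variance accumulates, which is the sense in which the three strategies below act on the same quantity.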

Targeting network state against mutant expansions
In order to explore the role of the mitochondrial network in the accumulation of de novo mutations, we invoked an infinite sites Moran model (Kimura, 1969) (see Figure 3A). Single cells were modelled over time as having a fixed mitochondrial copy number (n), and at each time step one mtDNA is randomly chosen for duplication and one (which can be the same) for removal. The individual replicated incurs Q de novo mutations, where Q is binomially distributed according to

$$Q \sim \mathrm{Binomial}(L_{\mathrm{mtDNA}}, \eta)$$

where Binomial(N, p) is a binomial random variable with N trials and probability p of success. L_mtDNA = 16569 is the length of mtDNA in base pairs and η = 5.6 × 10^-7 is the mutation rate per base pair per doubling (Zheng et al., 2006); hence each base pair is idealized to have an equal probability of mutation upon replication. In Supporting Information, Eq. (S83), we argue that when population size is controlled in the replication rate, the inter-event rate (Γ) of the Moran process is effectively rescaled by the fraction of unfused mitochondria, i.e. Γ = µ n f_s, which we apply here. Figure 3B shows that in the infinite sites model, the consequence of Eq. (S83) is that the rate of accumulation of mutations per cell reduces as the mitochondrial network becomes more fused, as does the mean number of mutations per mtDNA (Figure 3C). These observations are intuitive: since fusion serves to shield the population from mitophagy, mtDNA turnover slows down, and therefore there are fewer opportunities for replication errors to occur per unit time. Different values of f_s in Figures 3B&C therefore correspond to a rescaling of time, i.e. stretching of the time-axis. The absolute number of mutations predicted in Figure 3B may over-estimate the true number of mutations per cell (and of course depends on our choice of mutation rate), since a subset of mutations will experience either positive or negative selection. However, quantification of the number of distinct mitochondrial mutants in single cells remains under-explored, as most mutations will have a variant allele fraction close to 0% or 100% (Birky et al., 1983), which are challenging to measure, especially through bulk sequencing.
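The infinite sites Moran model with a rescaled event rate can be sketched in a few lines of Python. This is a minimal illustration rather than the simulation used to produce Figure 3: the function name, the copy number, the mitophagy rate and the simulated duration are hypothetical, and the sketch assumes that the new mutations are carried by the daughter molecule, which takes the place of the removed molecule.

```python
import numpy as np

L_MTDNA = 16569   # mtDNA length in base pairs (value quoted in the text)
ETA = 5.6e-7      # mutation rate per base pair per doubling (value quoted in the text)


def simulate_infinite_sites(n=100, mu=0.023, f_s=1.0, t_max=3650.0, seed=0):
    """Infinite sites Moran model with event rate Gamma = mu * n * f_s.

    n, mu (day^-1) and t_max (days) are hypothetical illustration values.
    Returns (number of events, distinct mutations in the cell,
    mean number of mutations per mtDNA) at time t_max.
    """
    if not 0.0 < f_s <= 1.0:
        raise ValueError("f_s must lie in (0, 1]")
    rng = np.random.default_rng(seed)
    event_rate = mu * n * f_s                    # rescaled inter-event rate
    genomes = [frozenset() for _ in range(n)]    # mutation labels per mtDNA
    next_site = 0                                # every mutation hits a new site
    events = 0
    t = rng.exponential(1.0 / event_rate)
    while t <= t_max:
        parent = int(rng.integers(n))            # molecule chosen for duplication
        removed = int(rng.integers(n))           # molecule chosen for removal (may equal parent)
        q = int(rng.binomial(L_MTDNA, ETA))      # Q ~ Binomial(L_mtDNA, eta)
        new_sites = frozenset(range(next_site, next_site + q))
        next_site += q
        # ASSUMPTION: the daughter carries the de novo mutations and
        # occupies the slot of the removed molecule.
        genomes[removed] = genomes[parent] | new_sites
        events += 1
        t += rng.exponential(1.0 / event_rate)
    distinct = len(frozenset().union(*genomes))
    mean_per_mtdna = float(np.mean([len(g) for g in genomes]))
    return events, distinct, mean_per_mtdna


if __name__ == "__main__":
    for f_s in (1.0, 0.5, 0.1):
        print(f_s, simulate_infinite_sites(f_s=f_s, seed=1))
```

Because f_s enters only through the event rate, a more fused network (smaller f_s) produces fewer turnover events in the same period, which is the time-rescaling interpretation given above. Setting f_s = 1 recovers a standard Moran process with Γ = µn.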
A study by Chen et al. (2010) observed the effect of deletion of two proteins which are involved in mitochondrial fusion (Mfn1 and Mfn2) in mouse skeletal muscle. Although knock-out studies present difficulties in extending their insights into the physiological case, the authors observed that fragmentation of the mitochondrial network induced severe depletion of mtDNA copy number (which we also observed in Figure S2B). Furthermore, the authors observed that the number of mutations per base pair increased upon fragmentation, which we also observed in the infinite sites model where fragmentation effectively results in a faster turnover of mtDNA ( Figure 3C).
Our models predict that promoting mitochondrial fusion has a two-fold effect: firstly, it slows the increase of heteroplasmy variance (see Eq. (13) and Figure 2H); secondly, it reduces the rate of accumulation of distinct mutations (see Figure 3B&C). These two effects are both a consequence of mitochondrial fusion rescaling the time to the next turnover event, and therefore the rate of random genetic drift. As a consequence, this simple model suggests that promoting fusion earlier in development (assuming mean heteroplasmy is low) could slow down the accumulation and spread of mitochondrial mutations, and perhaps slow aging.
If we assume that fusion is selective in favour of wild-type mtDNAs, which appears to be the case at least for some mutations under therapeutic conditions (Kandul et al., 2016;Suen et al., 2010), we predict that a balance between fusion and fission is the most effective means of removing mutant mtDNAs (see below), perhaps explaining why mitochondrial networks are often observed to exist as balanced between mitochondrial fusion and fission (Sukhorukov et al., 2012;Zamponi et al., 2018). In contrast, if selective mitophagy pathways are induced then promoting fragmentation is predicted to accelerate the clearance of mutants (see below).

Targeting mitophagy rate against mutant expansions
Alterations in the mitophagy rate µ have a comparable effect to changes in f_s in terms of reducing the rate of increase of heteroplasmy variance (see Eq. (13)) and the rate of de novo mutation (Figure 3B&C), since they both serve to rescale time. Our theory therefore suggests that inhibition of basal mitophagy may be able to slow down the rate of random genetic drift, and perhaps slow aging, by locking in low levels of heteroplasmy. Indeed, it has been shown that mouse oocytes (Boudoures et al., 2017) as well as mouse hematopoietic stem cells (de Almeida et al., 2017) have comparatively low levels of mitophagy, which is consistent with the idea that these pluripotent cells attempt to minimise genetic drift by slowing down mtDNA turnover. A previous modelling study has also shown that mutation frequency increases with mitochondrial turnover (Poovathingal et al., 2009).
Alternatively, it has also been shown that the presence of heteroplasmy, in genotypes which are healthy when present at 100%, can induce fitness disadvantages (Acton et al., 2007; Bagwan et al., 2018; Sharpley et al., 2012). In cases where heteroplasmy itself is disadvantageous, especially in later life where such mutations may have already accumulated, accelerating heteroplasmy variance increase to achieve fixation of a species could be advantageous. However, this will not avoid cell-to-cell variability, and the physiological consequences for tissues of such mosaicism are unclear.

Targeting copy number against mutant expansions
To investigate the role of mtDNA copy number (mtCN) on the accumulation of de novo mutations, we set f_s = 1 such that Γ = µn (i.e. a standard Moran process). We found that varying mtCN did not affect the mean number of mutations per molecule of mtDNA (Figure 3C, inset). However, as the population size becomes larger, the total number of distinct mutations increases accordingly (Figure 3D). In contrast to our predictions, a recent study by Wachsmuth et al. (2016) found a negative correlation between mtCN and the number of distinct mutations in skeletal muscle. However, Wachsmuth et al. (2016) also found a correlation between the number of distinct mutations and age, in agreement with our model. Furthermore, the authors used partial regression to find that age was more explanatory than mtCN in explaining the number of distinct mutations, suggesting age as a confounding variable to the influence of copy number. Our work shows that, in addition to age and mtCN, turnover rate and network state also influence the proliferation of mtDNA mutations. Therefore, one would ideally account for these four variables jointly, in order to fully constrain our model.
A study of single neurons in the substantia nigra of healthy human individuals found that mtCN increased with age (Dölle et al., 2016). Furthermore, mice engineered to accumulate mtDNA deletions through faulty mtDNA replication (Trifunovic et al., 2004) display compensatory increases in mtCN (Perier et al., 2013), which potentially explains the ability of these animals to resist neurodegeneration. It is possible that the observed increase in mtCN in these two studies is an adaptive response to slow down random genetic drift (see Eq. (13)). In contrast, mtCN reduces with age in skeletal muscle (Wachsmuth et al., 2016), as well as in a number of other tissues such as pancreatic islets (Cree et al., 2008) and peripheral blood cells (Mengel-From et al., 2014). Given the beneficial effects of increased mtCN in neurons, long-term increases in mtCN could delay other age-related pathological phenotypes.

Optimal mitochondrial network configurations for mitochondrial quality control
Whilst the above models of mtDNA dynamics are neutral (i.e. m and w share the same replication and degradation rates), it is often proposed that damaged mitochondria may experience a higher rate of degradation (Kim et al., 2007; Narendra et al., 2008). There are two principal ways in which selection may occur on mutant species. Firstly, mutant mitochondria may be excluded preferentially from the mitochondrial network in a background of unbiased mitophagy. If this is the case, mutants would be unprotected from mitophagy for longer periods of time than wild-types, and therefore be at greater hazard of degradation. We can alter the fusion rate (γ) in the mutant analogues of Eq. (1), (2) and Eqs. (7)-(9) as in Eq. (15), for all fusion reactions involving one or more mutant mitochondria, where ε_f > 0. The second potential selective mechanism we consider is selective mitophagy. In this case, the degradation rate of mutant mitochondria is larger than that of wild-types, i.e. we modify the mutant degradation reaction as in Eq. (16), for ε_m > 0. In these two settings, we explore how varying the fusion rate for a given selectivity (ε_f and ε_m) affects the extent of reduction in mean heteroplasmy. Figure 4A shows that, in the context of selective fusion (ε_f > 0) and non-selective mitophagy (ε_m = 0), the optimal strategy for clearance of mutants is to have an intermediate fusion/fission ratio. This was observed for all fusion selectivities investigated (see Figure S7). Intuitively, if the mitochondrial network is completely fused then, due to mitophagy only acting upon smaller mitochondrial units, mitophagy cannot occur, so mtDNA turnover ceases and heteroplasmy remains at its initial value. In contrast, if the mitochondrial network completely fissions, there is no mitochondrial network to allow the existence of a quality control mechanism: both mutants and wild-types possess the same probability per unit time of degradation, so mean heteroplasmy does not change. Since both extremes result in no clearance of mutants, the optimal strategy must be to have an intermediate fusion/fission ratio.
In contrast, in Figure 4B, in the context of non-selective fusion (ε_f = 0) and selective mitophagy (ε_m > 0), the optimal strategy for clearance of mutants is to completely fission the mitochondrial network. Intuitively, if mitophagy is selective, then the more mtDNAs which exist in fragmented organelles, the greater the number of mtDNAs which are susceptible to selective mitophagy, the greater the total rate of selective mitophagy, and the faster the clearance of mutants.

Discussion
In this work, we have sought to unify our understanding of three aspects of mitochondrial physiology (the mitochondrial network state, mitophagy, and copy number) with genetic dynamics. The principal virtue of our modelling approach is its simplified nature, which makes general, analytic, quantitative insights available for the first time. In using parsimonious models, we are able to make the first analytic link between the mitochondrial network state and heteroplasmy dynamics. This is in contrast to other computational studies in the field, whose structural complexity makes analytic progress difficult, and accounting for their predicted phenomena correspondingly more challenging.

Figure 4. (A) For selective fusion (see Eq. (15)), for each value of fusion selectivity (ε_f), the fusion rate (γ) was varied relative to the nominal parametrization (see Table S2). When ε_f > 0, the largest reduction in mean heteroplasmy occurs at intermediate values of the fusion rate; a deterministic treatment reveals this to be true for all fusion selectivities investigated (see Figure S7). (B) For selective mitophagy (see Eq. (16)), when mitophagy selectivity ε_m > 0, a lower mean heteroplasmy is achieved, the lower the fusion rate (until mean heteroplasmy = 0 is achieved). Hence, complete fission is the optimal strategy for selective mitophagy.
Our bottom-up modeling approach allows for potentially complex interactions between the physical (network) and genetic mitochondrial states of the cell, yet a simple connection emerged from our analysis. We found, for a wide class of models of post-mitotic cells, that the rate of linear increase of heteroplasmy variance is modulated in proportion to the fraction of unfused mitochondria (see Eq. (13)). The general notion that mitochondrial fusion shields mtDNAs from turnover, and consequently serves to rescale time, emerges from our analysis. This rescaling of time only holds when mitochondrial copy numbers are controlled through a state-dependent replication rate, and vanishes if copy numbers are controlled through a state-dependent mitophagy rate. We have presented the case of copy number control in the replication rate as being a more intuitive model than control in the degradation rate. The former has the interpretation of biogenesis being varied to maintain a constant population size, with all mtDNAs possessing a characteristic lifetime. The latter has the interpretation of all mtDNA molecules being replicated with a constant probability per unit time, regardless of how large or small the population size is, and changes in mitophagy acting to regulate population size. Such a control strategy seems wasteful in the case of stochastic fluctuations resulting in a population size which is too large, and potentially slow if fluctuations result in a population size which is too small. Furthermore, control in the replication rate means that the mitochondrial network state may act as an additional axis for the cell to control heteroplasmy variance ( Figure 2) and the rate of accumulation of de novo mutations ( Figure 3B&C). 
Single-mtDNA tracking through confocal microscopy in conjunction with mild mtDNA depletion could shed light on whether the probability of degradation per unit time per mtDNA varies when mtDNA copy number is perturbed, and therefore provide evidence for or against these two possible control strategies.
Our observations provide a substantial change in our understanding of mitochondrial genetics, as they suggest that the mitochondrial network state, in addition to mitochondrial turnover and copy number, must be accounted for in order to predict the rate of spread of mitochondrial mutations in a cellular population. Crucially, through building a model that incorporates mitochondrial dynamics, we find that the dynamics of heteroplasmy variance is independent of the absolute rate of fission-fusion events, since network dynamics occur approximately 10^3 times faster than mitochondrial turnover, inducing a separation of timescales. The independence of the absolute rate of network dynamics makes way for the possibility of gaining information about heteroplasmy dynamics via the mitochondrial network, without the need to quantify absolute fission-fusion rates (for instance through confocal micrographs to quantify the fraction of unfused mitochondria). By linking with classical statistical genetics, we find that the mitochondrial network also modulates the rate of accumulation of de novo mutations, also due to the fraction of unfused mitochondria serving to rescale time. We find that, in the context of mitochondrial quality control through selective fusion, an intermediate fusion/fission ratio is optimal due to the finite selectivity of fusion. This latter observation perhaps provides an indication for the reason why we observe mitochondrial networks in an intermediate fusion state under physiological conditions (Sukhorukov et al., 2012; Zamponi et al., 2018).
We have, broadly speaking, considered neutral models of mtDNA genetic dynamics. It is, however, typically suggested that increasing the rate of mitophagy promotes mtDNA quality control, and therefore shrinks the distribution of heteroplasmies towards 0% mutant (see Eq. (15) and Eq. (16)). If mitophagy is able to change mean heteroplasmy, then a neutral genetic model appears to be inappropriate, as mutants experience a higher rate of degradation. Stimulation of the PINK1/Parkin pathway has been shown to select against deleterious mtDNA mutations in vitro (Suen et al., 2010) and in vivo (Kandul et al., 2016), as has repression of the mTOR pathway via treatment with rapamycin (Dai et al., 2013;Kandul et al., 2016). However, the necessity of performing a genetic/pharmacological intervention to clear mutations via this pathway suggests that the ability of tissues to selectively remove mitochondrial mutants under physiological conditions is weak. Consequently, neutral models such as our own are useful in understanding how the distribution of heteroplasmy evolves through time under physiological conditions. Indeed, it has been recently shown that mitophagy is basal (McWilliams et al., 2016) and can proceed independently of PINK1 in vivo (McWilliams et al., 2018), perhaps suggesting that mitophagy has non-selective aspects -although this is yet to be verified conclusively.
We have paid particular attention to the case of post-mitotic tissues, since these tissues are important for understanding the role of mitochondrial mutations in healthy aging (Kauppila et al., 2017; Khrapko and Vijg, 2009). A typical rate of increase of heteroplasmy variance predicted by Eq. (13) given our nominal parametrization (Table S2) . This value accounts for the accumulation of heteroplasmy variance which is attributable to turnover of the mitochondrial population in a post-mitotic cell. However, in the most general case, cell division is also able to induce substantial heteroplasmy variance. For example, V(h)/t has been measured in model organism germlines to be approximately 9 × 10^-4 day^-1 in Drosophila (Johnston and Jones, 2016; Solignac et al., 1987), 9 × 10^-4 day^-1 in NZB/BALB mice (Johnston and Jones, 2016; Wai et al., 2008; Wonnapinij et al., 2008), and 2 × 10^-4 day^-1 in single LE and HB mouse oocytes (Burgstaller et al., 2018). We see that these rates of increase in heteroplasmy variance are approximately an order of magnitude larger than predictions from our model of purely quiescent turnover, given our nominal parametrization. Whilst larger mitophagy rates may also potentially induce larger values for V(h)/t (see Poovathingal et al. (2012), and Figure S5C, corresponding to V(h)/t ≈ 3.5 × 10^-4 day^-1), it is clear that partitioning noise (or "vegetative segregation", Stewart and Chinnery (2015)) is also an important source of variance in heteroplasmy dynamics (Johnston et al., 2015). Quantification of heteroplasmy variance in quiescent tissues remains an under-explored area, despite its importance in understanding healthy ageing (Aryaman et al., 2019; Kauppila et al., 2017).
Our findings reveal some apparent differences with previous studies which link mitochondrial genetics with network dynamics (see Table S4). Firstly, Tam et al. (2013, 2015) found that slower fission-fusion dynamics resulted in larger increases in heteroplasmy variance with time, in contrast to Eq. (13) which only depends on fragmentation state and not absolute network rates. The simulation approach of Tam et al. (2013, 2015) allowed for mitophagy to act on whole mitochondria, where mitochondria consist of multiple mtDNAs. Faster fission-fusion dynamics tended to form heteroplasmic mitochondria whereas slower dynamics formed homoplasmic mitochondria. It is intuitive that mitophagy of a homoplasmic mitochondrion induces a larger shift in heteroplasmy than mitophagy of a single mtDNA; hence slower network dynamics, which form more homoplasmic mitochondria, produce larger increases in heteroplasmy variance. However, this apparent difference with our findings can naturally be resolved if we consider the regions in parameter space where the fission-fusion rate is much larger than the mitophagy rate, as is empirically observed to be the case (Burgstaller et al., 2014a; Cagalinec et al., 2013). If the fission-fusion rates are sufficiently large to ensure heteroplasmic mitochondria, then further increasing the fission-fusion rate is unlikely to have an impact on heteroplasmy dynamics. Hence, this finding is potentially compatible with our study, although future experimental studies investigating intra-mitochondrial heteroplasmy would help constrain these models. Tam et al. (2015) also found that fast fission-fusion rates could induce an increase in mean heteroplasmy, in contrast to Figure 2D which shows that mean heteroplasmy is constant with time after a small initial transient due to stochastic effects. We may speculate that the key difference between our treatment and that of Tam et al. (2013, 2015) is the inclusion of cellular subcompartments, which induces spatial effects that we do not consider here. The uncertainty in accounting for the phenomena observed in such complex models highlights the virtues of a simplified approach which may yield interpretable laws and principles through analytic treatment.
The study of Mouli et al. (2009) suggested that, in the context of selective fusion, higher fusion rates are optimal. This initially seems to contrast with our finding that intermediate fusion rates are optimal for the clearance of mutants (Figure 4A). However, the high fusion rates in that study do not correspond directly to the highly fused state in our study. Fission automatically follows fusion in Mouli et al. (2009), ensuring at least partial fragmentation, and the high fusion rates for which they identify optimal clearing are an order of magnitude lower than the highest fusion rate they consider. In the case of complete fusion, mitophagy cannot occur in the model of Mouli et al. (2009).
In order to fully test our model, further single-cell longitudinal studies are required. For instance, the study by Burgstaller et al. (2018) found a linear increase in heteroplasmy variance through time in single oocytes. Our work here has shown that measurement of the network state, as well as turnover and copy number, is required to account for the rate of increase in heteroplasmy variance. Joint longitudinal measurements of f_s, µ and n, with heteroplasmy quantification, would allow verification of Eq. (13) and aid in determining the extent to which neutral genetic models are explanatory. This could be achieved, for instance, using the mito-QC mouse (McWilliams et al., 2016) which allows visualisation of mitophagy and mitochondrial architecture in vivo. Measurement of f_s, µ and n, followed by e.g.
destructive single-cell whole-genome sequencing of mtDNA would allow validation of how µ, n and f_s influence V(h) and the rate of de novo mutation (see Figure 3). One difficulty is sequencing errors induced through e.g. PCR, which hamper our ability to accurately measure mtDNA mutation within highly heterogeneous samples (Woods et al., 2018). Morris et al. (2017) have suggested that single cells are highly heterogeneous in mtDNA mutation, with each mitochondrion possessing 3.9 single-nucleotide variants on average. Error correction strategies during sequencing may pave the way towards high-accuracy mtDNA sequencing (Salk et al., 2018; Woods et al., 2018), and allow us to better constrain models of heteroplasmy dynamics.

Constant rates yield unstable copy numbers for a model describing mtDNA genetic and network dynamics
We explored a simpler network system than the one presented in the Main Text, but found that it produced instability in mtDNA copy numbers, which we regard as biologically undesirable. Consider the following set of Poisson processes for singleton (s) and fused (f) species (Eqs. (S1)-(S7)), where Eq. (S1)-(S3) are analogous to Eq. (1)-(3) where mutant species are neglected. Eq. (S4) and (S5) are simple birth processes with a shared constant rate αρ. Eq. (S6) and (S7) are simple death processes with rates ηρ and ρ respectively. The parameter ρ is shared amongst all of the birth and death reactions in Eqs. (S4)-(S7). ρ represents the intuitive assumption that, in order for a stable population size to exist, birth should balance death. However, for the network to have any effect at all, singletons should be at an increased risk of mitophagy relative to fused species. We represent the increased risk of singleton mitophagy with the parameter η. Since additional death is introduced into the system when η > 1, we include the parameter α > 1 as an increased global biogenesis rate to balance the increased mitophagy of singletons. We may write the above system as a set of ordinary differential equations

$$\frac{ds}{dt} = -\gamma s^2 - \gamma f s + \beta f + \alpha\rho s - \eta\rho s \qquad \text{(S8)}$$

where we have enforced the stochastic reaction rate to be equivalent to the deterministic reaction rate, and hence the s^2 term is proportional to γ rather than 2γ (justification of this is presented below, see Eq. (S20)).
In Figure S1 we see that the system displays a trivial steady state at s = f = 0 and a non-trivial steady state. Computing the eigenvalues of the Jacobian matrix at the non-trivial steady state indicates that it is a saddle point, and therefore unstable. Initial copy numbers which are too small tend towards extinction with time, and initial copy numbers which are too large tend towards a copy number explosion. This simple example suggests that a system of this form with constant reaction rates is unstable, and therefore biologically unlikely to exist under reasonable circumstances. We hence consider analogous biochemical reaction networks with a replication rate which is a function of state, to prevent extinction and divergence of the total population size.
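The stability argument can be checked numerically with a short Python sketch. Only the equation for ds/dt, Eq. (S8), is available here; the companion equation for df/dt used below is an assumption, built from the verbal description of the reactions (fusion converts singletons into fused species, fission at rate β does the reverse, fused species replicate at rate αρ and are degraded at rate ρ). The parameter values and function names are hypothetical and chosen only so that a non-trivial steady state exists (1 < α < η and β > (α − 1)ρ).

```python
import numpy as np


def rhs(s, f, gamma, beta, rho, alpha, eta):
    """Right-hand side of the constant-rate network ODEs."""
    ds = -gamma * s**2 - gamma * f * s + beta * f + alpha * rho * s - eta * rho * s  # Eq. (S8)
    # ASSUMPTION: df/dt is inferred from the description of the reactions.
    df = gamma * s**2 + gamma * f * s - beta * f + alpha * rho * f - rho * f
    return ds, df


def nontrivial_steady_state(gamma, beta, rho, alpha, eta):
    """Non-trivial steady state (s*, f*) of the assumed system."""
    c_s = (eta - alpha) * rho      # net per-capita loss of singletons
    c_f = (alpha - 1.0) * rho      # net per-capita gain of fused species
    ratio = c_s / c_f              # f*/s*, from d(s + f)/dt = 0
    s = (beta * ratio - c_s) / (gamma * (1.0 + ratio))
    return s, ratio * s


def jacobian(s, f, gamma, beta, rho, alpha, eta):
    """Jacobian of rhs with respect to (s, f)."""
    return np.array([
        [-2.0 * gamma * s - gamma * f + (alpha - eta) * rho, -gamma * s + beta],
        [2.0 * gamma * s + gamma * f, gamma * s - beta + (alpha - 1.0) * rho],
    ])


# Hypothetical parameter values, for illustration only.
params = dict(gamma=0.01, beta=1.0, rho=0.02, alpha=1.5, eta=3.0)
s_star, f_star = nontrivial_steady_state(**params)
eigenvalues = np.linalg.eigvals(jacobian(s_star, f_star, **params))

if __name__ == "__main__":
    print("steady state (s*, f*):", (s_star, f_star))
    print("Jacobian eigenvalues:", eigenvalues)
```

For these values the two eigenvalues are real and of opposite sign, so perturbations along one direction grow while those along the other decay, which is the saddle-type instability described above.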

Conversion of a chemical reaction network into ordinary differential equations
The following section outlines the steps in converting a set of chemical reactions into a set of ordinary differential equations (ODEs). In particular, we pay special attention to the fact that the rate of a chemical reaction with a stochastic treatment is not always equivalent to the rate in a deterministic treatment (Wilkinson, 2011), as we will explain below. This subtlety is sometimes overlooked in the literature. This section draws on a number of standard texts (Gillespie, 1976, 2007; Van Kampen, 1992; Wilkinson, 2011) as well as Grima (2010). We hope this harmonized treatment will be of help as a future reference. Consider a general chemical system consisting of N distinct chemical species (X_i) interacting via R chemical reactions, where the j-th reaction is of the form

$$s_{1j} X_1 + \dots + s_{Nj} X_N \xrightarrow{\hat{k}_j} r_{1j} X_1 + \dots + r_{Nj} X_N$$

where s_ij and r_ij are stoichiometric coefficients. We define k̂_j as the microscopic rate for this reaction. The dimensionality of this parameter will vary depending upon the stoichiometric coefficients s_ij. k̂_j may be loosely interpreted as setting the characteristic timescale (i.e. the cross section (Wilkinson, 2011)) of reaction j.
The chemical master equation (CME) describes the dynamics of the joint distribution of the state of the system and time, moving forwards through time. Defining the state of the system as x = (x_1, ..., x_N)^T, where x_i is the copy number of the i-th species, allows us to write the CME as (Grima, 2010)

$$\frac{\partial P(x,t)}{\partial t} = \Omega \sum_{j=1}^{R} \left( \prod_{i=1}^{N} E_i^{-S_{ij}} - 1 \right) \hat{f}_j(x,\Omega)\, P(x,t) \qquad \text{(S11)}$$

where Ω is the volume of the compartment in which the reactions occur (also known as the system size), S_ij = r_ij − s_ij is the stoichiometry matrix, and E_i^{-S_ij} is referred to as the step operator and is defined through the relation E_i^{-S_ij}(g(x)) = g(x_1, ..., x_i − S_ij, ..., x_N), for any function of state g(x). f̂_j(x, Ω) is the microscopic rate function of reaction j, which in general depends on both the state and the system size. A factor of Ω is explicitly included in this definition of the chemical master equation so that our treatment is compatible with Van Kampen's system size expansion (Van Kampen, 1992). As a consequence of this, the probability that, given the current state x, the j-th reaction occurs in the time interval [t, t + dt) somewhere in Ω (Gillespie, 2007) is

$$\hat{a}_j(x,\Omega)\,dt := \Omega \hat{f}_j(x,\Omega)\,dt. \qquad \text{(S12)}$$

â_j(x, Ω) is termed the propensity function (or "hazard") and is of particular relevance in the stochastic simulation algorithm (Gillespie, 1976), since â_j(x, Ω)/Σ_j â_j(x, Ω) determines the probability that the j-th reaction occurs next. For the microscopic rate function, we may write Eq. (S13). This equation counts the number of available combinations of reacting molecules (Gillespie, 1976; Wilkinson, 2011), whilst taking into account scaling with system size (Grima, 2010). We also introduce the deterministic rate equation (generally considered to be the macroscopic analogue of the CME) which is defined as (Grima, 2010; Van Kampen, 1992)

$$\frac{d\phi_i}{dt} = \sum_{j=1}^{R} S_{ij} \tilde{f}_j(\phi) \qquad \text{(S14)}$$

where φ = (φ_1, ..., φ_N)^T is the vector of macroscopic concentrations (of dimensions molecules per unit volume) and f̃_j(φ) is the macroscopic rate function satisfying

$$\tilde{f}_j(\phi) = \tilde{k}_j \prod_{i=1}^{N} \phi_i^{s_{ij}} \qquad \text{(S15)}$$

where k̃_j is the macroscopic rate for the j-th reaction. We distinguish between k̂_j and k̃_j, respectively the rate constants for the discrete and continuous pictures, although this distinction is sometimes not emphasized in the literature (Grima, 2010; Grima et al., 2011; Van Kampen, 1992). The physical meaning of k̃_j is not immediately obvious: we argue that this parameter only gains physical meaning through the following procedure.
As stated by Wilkinson (2011), if we intend for the microscopic description in Eq. (S11) to correspond to the macroscopic description in Eq. (S14), the rate of consumption/production of particles for every reaction must be the same in the deterministic limit of the stochastic system (the conditions for which we define below). Therefore, we apply the following constraint in the limit of large copy numbers

$$\lim_{x_i \to \infty} \hat{f}_j(x,\Omega) = \tilde{f}_j(\phi) \quad \forall\, i, j. \qquad \text{(S16)}$$

In applying this constraint on all species i and all reactions j, we may derive a general relationship between the microscopic and macroscopic rates (Eq. (S17)). We can make two approximations to generate a more convenient relationship between the microscopic and macroscopic rates. Firstly, we assume that

$$x_i \approx \Omega \phi_i. \qquad \text{(S18)}$$

This is a small noise approximation, since it is often assumed that x_i = Ωφ_i + Ω^{1/2} ξ_i, where ξ_i is a noise term (Van Kampen, 1992). If ξ_i is small then x_i ≈ Ωφ_i is a valid approximation. Secondly, we assume that

$$x_i! \approx x_i^{s_{ij}} (x_i - s_{ij})!. \qquad \text{(S19)}$$

This is a large copy number approximation: in the case of e.g. a bimolecular reaction (2X_i → *) with s_ij = 2, the approximation is of the form x_i(x_i − 1) ≈ x_i^2. By applying Eq. (S19) to the factor of x_i! in Eq. (S17), the factor of (x_i − s_ij)! cancels from the left-hand side. Simplifying using Eq. (S18), φ_i^{s_ij} cancels from both sides and we arrive at the important relationship Eq. (S20). With Eq. (S14), Eq. (S15) and Eq. (S20) one may therefore write down a set of ODEs for an arbitrary chemical reaction network, with constant reaction rates, in terms of the microscopic rates k̂_j. This equation highlights that for reactions with s_ij ≥ 2, k̂_j ≠ k̃_j, as is the case for bimolecular reactions of the form 2X_i → * (see Eq. (1) and Eq. (S1)). Importantly, if the microscopic rate function is a function of state then k̂ = k̂(x) and k̃ = k̃(φ) ≈ k̃(x/Ω). In this case, Eq. (S20) still applies since the above argument assumed nothing about the particular forms of k̂ and k̃. However, additional factors of Ω^{-1} are induced by applying Eq. (S18), which may carry through to the individual parameters of k̃(φ). A demonstration of this is given in the following section.
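The difference between microscopic and macroscopic rates can be made concrete with a small Python sketch. It assumes a system size of Ω = 1 and that the microscopic rate function counts unordered combinations of reactant molecules, so that the macroscopic rate is the microscopic rate divided by the product of s_ij! over species; both this form and the function names are assumptions made for illustration, not a transcription of Eq. (S13) or Eq. (S20).

```python
from math import comb, factorial, prod


def reactant_combinations(x, s):
    """Exact number of distinct reactant combinations for copy numbers x
    and reactant stoichiometries s (the combinatorial factor of the hazard)."""
    return prod(comb(x_i, s_i) for x_i, s_i in zip(x, s))


def large_copy_number_combinations(x, s):
    """Large copy number approximation: x!/(x - s)! ~ x**s for each species."""
    return prod(x_i**s_i / factorial(s_i) for x_i, s_i in zip(x, s))


def macroscopic_rate(k_hat, s):
    """ASSUMPTION (Omega = 1): k_tilde = k_hat / prod_i s_i!."""
    return k_hat / prod(factorial(s_i) for s_i in s)


if __name__ == "__main__":
    # Bimolecular reaction with one species, 2X -> *: the rates differ by 1/2.
    print(macroscopic_rate(1.0, [2]))
    # Bimolecular reaction with two species, X + Y -> *: the rates coincide.
    print(macroscopic_rate(1.0, [1, 1]))
    # The approximation x(x - 1) ~ x**2 improves as copy number grows.
    for x in (10, 100, 10_000):
        exact = reactant_combinations([x], [2])
        approx = large_copy_number_combinations([x], [2])
        print(x, exact / approx)
```

Under these assumptions, a fusion reaction between two singletons with microscopic rate γ has macroscopic rate γ/2 but consumes two singletons per event, giving a net term of −γ s^2 in the rate equation; this is one way to read the statement that the s^2 term is proportional to γ rather than 2γ.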

Deriving an ODE description of the mitochondrial network system
In this section we show how to derive an ODE description of the network system described in Eq. (1)-Eq. (9) in the Main Text. In accordance with the notation in the previous section, we will redefine all of the rates in Eq. (1)-Eq. (10) with a hat notation (â, for a general rate parameter a), to reflect that these are stochastic rates. Deterministic rates will be denoted with a tilde (ã). Our aim will be to write a set of ODEs in terms of the stochastic rates,â, for which we are able to estimate values.
We will begin by considering the fusion network equations Eq. (1) and Eq. (2). For clarity, we rewrite Eq. (1) to allow the reaction to proceed with some arbitrary rate $\hat{\rho}$, where $X$ denotes either a wild-type ($W$) or mutant ($M$). We will subsequently fix $\hat{\rho}$ to the rate of all other fusion reactions, $\hat{\gamma}$. We do this because Eq. (1) is a bimolecular reaction involving one species: a fundamentally different reaction to bimolecular reactions involving two species, as we will now see. Since $\hat{\rho}, \hat{\gamma} = \mathrm{const}$, we may use Eq. (S20), resulting in the deterministic rates for Eq. (S21) and Eq.

Proof of heteroplasmy relation for linear feedback control
In this section we show that Eq. (13) holds for the system described by Eq. (1)-Eq. (9) given the replication rate in Eq. (10) using the Kramers-Moyal expansion under conditions of large copy number and fast network churn (to be defined below); the approach used here is similar to Constable et al. (2016). Consonant with the self-contained objectives of STAR methods, we draw together elements from the literature to provide a coherent derivation; we therefore hope that the following exposition may provide clarity for a wider audience.
Kramers-Moyal expansion of the chemical master equation for large copy numbers
Customarily, the Kramers-Moyal expansion is formed using a continuous-space notation (Gardiner, 1985), so we will initially proceed in this way. Following the treatment by Gardiner (1985), we begin by re-writing the chemical master equation (CME), Eq. (S11), in continuous-space form, where we have set $\Omega = 1$, $T(x|x')$ is the transition rate from state $x' \to x$, and the dependence upon the initial condition has been suppressed for notational convenience. We now proceed by expanding the CME. The multivariate Kramers-Moyal expansion may be written as Eq. (S38), where $H(x)$ is the Hessian matrix of $T(x'|x)P(x)$, Eq. (S39) (see Gardiner (1985) for a proof of this in the univariate case).
A transition to each possible neighbouring state $x'$ corresponds to some reaction $j$ which moves the state from $x \to x'$. Since we know the influence of each reaction on state $x$ through the constant stoichiometry matrix $S_{ij}$, and that the propensity of a reaction does not depend upon $x'$ itself (see Eq. (S13)), we may transition from a notation involving $x$ and $x'$ into a notation involving $x$ and $j$. We may therefore define $T_j(x) := T(x'|x) \equiv \hat{f}_j(x)$ (see Eq. (S13)), and let $H(x) \to H_j(x)$.
We now make a large copy number assumption in order to simplify $T_j(x)$. To take a large copy number limit, we make the approximation in Eq. (S40). This approximation is exact when $s_{ij} = 0, 1$, but inexact when $s_{ij} \geq 2$. For example, if we consider the second-order bimolecular reaction in Eq. (1), Eq. (S40) is equivalent to assuming $w_s^2 \approx w_s(w_s - 1)$; consequently, a factor of $1/(s_{ij}!) = 1/2$ arises in $T_j(x)$ as a combinatorial factor from stochastic considerations.
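The size of the error introduced by this kind of large copy number approximation can be checked numerically. The sketch below is illustrative only: the function name `relative_error` and the copy numbers tried are our own choices, not taken from the original analysis. It compares the exact falling factorial $x!/(x - s)!$ with the power $x^s$.

```python
from math import perm


def relative_error(x: int, s: int) -> float:
    """Relative error of approximating x!/(x-s)! by x**s.

    x is a copy number and s a stoichiometric coefficient (both
    hypothetical inputs chosen for illustration).
    """
    exact = perm(x, s)  # x * (x-1) * ... * (x-s+1)
    return (x**s - exact) / exact


# For s = 2 the error is exactly 1/(x-1), so it shrinks as copy number grows.
for copies in (10, 250, 1000):
    print(copies, relative_error(copies, 2))
```

For $s = 0$ or $s = 1$ the error is zero, in line with the statement that the approximation is exact in those cases.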

Fokker-Planck equation for chemical reaction networks
We now wish to re-write Eq. (S38) as a Fokker-Planck equation. Since the integral in Eq. (S38) is over $x'$, and every $x'$ corresponds to a reaction $j$, we may interpret the integral in Eq. (S38) as a sum over all reactions, i.e. $\int dx' \to \sum_{j=1}^{R}$. Hence, for the $j$th reaction, $[(x' - x)]_i = S_{ij}$. With these observations, we may write the first integral of Eq. (S38) in terms of a vector $A$, where $A$ is of length $N$, $[S]_{ij} := r_{ij} - s_{ij}$ is the $N \times R$ stoichiometry matrix (Eq. (S10)), and $T$ is the vector of transition rates, of length $R$ (for which we have taken a large copy number approximation in Eq. (S40)).
To re-write the second integral of Eq. (S38), we write an element of the Hessian $H_j$ in Eq. (S39) in index form, where $j = 1, \dots, R$ and $l, m = 1, \dots, N$. Thus, we may write the second integral in terms of a matrix $B$ (Eq. (S45)), where $B$ is an $N \times N$ matrix and $\mathrm{Diag}(Y)$ is a diagonal matrix whose main diagonal is the vector $Y$. We may therefore re-write Eq. (S38) as a Fokker-Planck equation for the state vector $x$, Eq. (S46).
Fokker-Planck equation for an arbitrary function of state
We now wish to make a change of variables in Eq. (S46) to write down a Fokker-Planck equation for an arbitrary scalar function of state $x$ (which we will later set to be heteroplasmy). To do this, we wish to make use of Itô's formula, which allows a change of variables for an SDE. In general, the Fokker-Planck equation in Eq. (S46) is equivalent (Jacobs, 2010) to the Itô stochastic differential equation (SDE) in Eq. (S47), where $GG^T \equiv B$ (with $G$ an $N \times R$ matrix) and $dW$ is a vector of independent Wiener increments of length $R$, each with zero mean and variance $dt$. Itô's formula states that, for an arbitrary function $h(x, t)$ where $x$ satisfies Eq. (S47), we may write the SDE in Eq. (S49), where $H_h(x)$ is the Hessian matrix of $h(x, t)$ (see Eq. (S39), where $T(x'|x)P(x)$ should be replaced with $h(x, t)$). Given the form of $B$ in Eq. (S45), we let $G$ take a form which satisfies $GG^T \equiv B$.
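As a concrete illustration of the objects $A$, $B$ and $G$, the sketch below builds them for a small toy network. It assumes the standard chemical Langevin forms $A = ST$, $B = S\,\mathrm{Diag}(T)\,S^T$ and $G = S\,\mathrm{Diag}(\sqrt{T})$; these are consistent with the dimensions quoted in the text, but that they match the displayed equations term by term is an assumption. The two-species, three-reaction network and its rate values are hypothetical.

```python
from math import isclose, sqrt


def drift_and_diffusion(S, T):
    """Return (A, B, G) for an N x R stoichiometry matrix S and R rates T.

    Assumed forms (standard chemical Langevin equation):
      A = S T,  B = S Diag(T) S^T,  G = S Diag(sqrt(T)).
    """
    N, R = len(S), len(T)
    A = [sum(S[i][j] * T[j] for j in range(R)) for i in range(N)]
    B = [[sum(S[l][j] * T[j] * S[m][j] for j in range(R)) for m in range(N)]
         for l in range(N)]
    G = [[S[i][j] * sqrt(T[j]) for j in range(R)] for i in range(N)]
    return A, B, G


# Hypothetical toy network: species (u, f); reactions u -> f, f -> u, u -> 0.
S = [[-1, 1, -1],
     [1, -1, 0]]
T = [2.0, 3.0, 0.5]  # illustrative transition rates, not fitted values

A, B, G = drift_and_diffusion(S, T)
GGt = [[sum(G[l][j] * G[m][j] for j in range(len(T))) for m in range(len(S))]
       for l in range(len(S))]
print("A =", A)
print("B =", B)
print("G G^T equals B:",
      all(isclose(GGt[l][m], B[l][m]) for l in range(2) for m in range(2)))
```

Choosing $G$ with one column per reaction keeps one independent Wiener increment per reaction, which is why $dW$ has length $R$ rather than $N$.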
For convenience, we may also perform the transformation purely at the level of Fokker-Planck equations. Let $h(x, t)$ satisfy the general Fokker-Planck equation, Eq. (S51), for scalar functions $\tilde{A}(h, t)$ and $\tilde{B}(h, t)$. Using the cyclic property of the trace in Eq. (S49), we may identify $\tilde{A}$ as in Eq. (S52), where $\mathrm{Tr}$ is the trace operator. Also, from Eq. (S49), we may identify $\tilde{B}$ as in Eq. (S53). Hence, using Eq. (S51), Eq. (S52) and Eq. (S53), we may write down a Fokker-Planck equation for an arbitrary function of state in terms of $A$ and $B$.
An SDE for heteroplasmy forced onto the steady state line in the high-churn limit
It has been demonstrated that SDE descriptions of stochastic systems which possess a globally-attracting line of steady states may be formed in the long-time limit by forcing the state variables onto the steady state line (Constable et al., 2016; Parsons and Rogers, 2017). Such descriptions may be formed in terms of a parameter which traces out the position on the steady state line, hence reducing a high-dimensional problem into a single dimension (Constable et al., 2016; Parsons and Rogers, 2017). In our case, heteroplasmy is a suitable parameter to trace out the position on the steady state line. We seek to use similar reasoning to verify Eq. (13). In what follows, we will assume that $x(t = 0) = x_{ss}$, where $x_{ss}$ is the state which is the solution of $A = 0$ (which is equivalent to finding the steady state solution of the deterministic rate equation in Eq. (S14) due to our assumption of large copy numbers and $\Omega = 1$), so that we may neglect any deterministic transient dynamics.
Inspection of the steady state of the ODE description of our system reveals that the set of steady state solutions forms a line (see Eqs. (S34)-(S36)), and that the steady state depends on the fusion ($\gamma$) and fission ($\beta$) rates. Mitochondrial network dynamics occur on a much faster timescale than the replication and degradation of mtDNA: the former occur on the timescale of minutes (Twig et al., 2008) whereas the latter occur over hours or days (Johnston and Jones, 2016). We seek to use this separation of timescales to arrive at a simple form for $\mathbb{V}(h)$. We redefine the fusion and fission rates by scaling both with a constant $M$, which determines the magnitude of the fusion and fission rates and which we call the "network churn". We now wish to use heteroplasmy as our choice for the function of state in the Fokker-Planck equation in Eq. (S51). We will first compute the diffusion term $\tilde{B}$ for heteroplasmy using Eq. (S53). If we constrain the state $x$ to be forced onto the steady state line $x_{ss}$ (as per Constable et al., 2016; Parsons and Rogers, 2017) in the high-churn limit, we arrive at an expression for $\tilde{B}$, Eq. (S57), which is difficult to interpret. In order to perform further simplification, we make an ansatz for the form of $\tilde{B}$ ($\tilde{B}_{\mathrm{An}}$) and seek to determine whether our ansatz is equivalent to the derived form of $\tilde{B}$ under the constraints defined on the left-hand side of Eq. (S57). Our ansatz, Eq. (S58), is written in terms of $f_s(x) := (w_s + m_s)/(w_s + w_f + m_s + m_f)$ and $n(x) := w_s + w_f + m_s + m_f$. Notice that this ansatz is more general than Eq. (S57), since it has no explicit dependence upon the parameters of the control law assumed in Eq. (10), and only explicitly depends upon functions of state $x$. Upon substituting the steady state $x_{ss}$ into the ansatz in Eq. (S58) and taking the high-churn limit, we find that $\tilde{B}_{\mathrm{An}}$ agrees with the derived form of $\tilde{B}$. This is equivalent to the SDE for heteroplasmy in Eq. (S63), which holds in the limit of large network churn, large copy numbers, and a second-order truncation of the Kramers-Moyal expansion.
Although the state has been forced onto the steady state line, stochastic fluctuations mean that trajectories may move along the line of steady states, so the diffusion coefficient is not constant in general. We may calculate the new value of $x_{ss}(h)$ for every displacement due to Wiener noise in $h$, and substitute into $f_s(x)$ and $n(x)$ to determine the diffusion coefficient at the next time step. However, for sufficiently short times, and large copy numbers (i.e. low diffusivity of $h$), we may assume that the diffusion coefficient in Eq. (S63) may be approximated as constant. An SDE with no drift and a constant diffusion coefficient has a Gaussian solution $\mathcal{N}(y|y_0, \sigma^2)$, where $\mathcal{N}(y|y_0, \sigma^2)$ is a Gaussian distribution on $y$ with mean $y_0$ and variance $\sigma^2$, $y_0 = y(t = 0)$, and $\sigma^2$ grows linearly in time. Since we have assumed that the state is initialised at $x(t = 0) = x_{ss}$, there are no deterministic transient dynamics, so we may write down the heteroplasmy variance, recovering Eq. (13), where $\mathbb{V}$ returns the variance of a random variable. In this equation, we take $x = x_{ss} = \mathrm{const}$, since we have assumed a low-diffusion limit. We observe that this equation is of precisely the same form as Equation 12 of Johnston and Jones (2016), except with an additional proportionality factor of $f_s$ induced by the inclusion of a mitochondrial network.
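The step from a constant diffusion coefficient to a variance that grows linearly in time can be checked with a short Euler-Maruyama simulation. This is a generic sketch, not the authors' code: the SDE $dy = c\,dW$, the constant `c`, and all numerical values are placeholders. In the setting above, $c$ would be the diffusion coefficient of Eq. (S63) evaluated at $x_{ss}$.

```python
import random
from statistics import fmean, pvariance


def simulate_endpoints(y0, c, t_end, n_steps, n_paths, seed=0):
    """Euler-Maruyama endpoints of dy = c dW with constant c (placeholder)."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    step_sd = c * dt ** 0.5  # each Wiener increment has variance dt
    endpoints = []
    for _ in range(n_paths):
        y = y0
        for _ in range(n_steps):
            y += rng.gauss(0.0, step_sd)
        endpoints.append(y)
    return endpoints


# Illustrative values only: y0 plays the role of initial heteroplasmy.
y0, c, t_end = 0.3, 0.003, 100.0
ends = simulate_endpoints(y0, c, t_end, n_steps=50, n_paths=4000)
print("sample mean      :", fmean(ends))
print("sample variance  :", pvariance(ends))
print("c**2 * t_end     :", c**2 * t_end)
```

The sample mean stays near `y0` and the sample variance near `c**2 * t_end`, which is the Gaussian behaviour used to pass from the SDE to a variance formula. The sketch does not enforce the bounds $0 \le h \le 1$, so it is only meaningful while the spread is small compared with the distance to either boundary.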

Heteroplasmy variance relations for alternative model structures and modes of genetic mtDNA control
Here we explore the implications of alternative model structures upon Eq. (S63). Firstly, we may consider replacing Eq. (4) with a reaction in which replication coincides with fission (see Lewis et al., 2016). Repeating the calculation in the previous section also results in Eq. (S63), so the result is robust to the particular choice of mtDNA replication reaction (see GitHub repository for Mathematica notebook). Secondly, we may explore the impact of allowing non-zero mtDNA degradation of fused species. This could correspond to autophagy-independent degradation of mtDNA, for example via the exonuclease activity of POLG (Medeiros et al., 2018). To encode this, we may add an additional degradation reaction for fused species, Eq. (S68), with relative rate $\xi$, where $0 \leq \xi \leq 1$. We were not able to make analogous analytical progress in this instance. However, numerical investigation (Figure S3E) revealed that the ansatz in Eq. (S69) was able to predict heteroplasmy variance dynamics. In other words, allowing degradation of fused species results in a linear correction to our heteroplasmy variance formula in Eq. (13). If fused species are susceptible to degradation at the same rate as unfused species ($\xi = 1$), then $\mathbb{V}(h)$ loses $f_s$ dependence entirely and the mitochondrial network has no influence over heteroplasmy dynamics.
We also explored various different forms of $\lambda(x)$ and $\mu(x)$, which we label A-G after Johnston and Jones (2016), and X-Z for several newly-considered functional forms, see Table S3 and Figure S4A-I. Control D of Johnston and Jones (2016) involves no feedback, which we do not explore; see Figure S1 and the discussion surrounding Eq. (S1). The argument presented in the previous section requires the steady state solution of the system to be solvable, since we require the explicit form of $x_{ss}$ in Eq. (S57), Eq. (S59) and Eq. (S61). For controls B, C, E, F, G, Y and Z in Table S3, the steady states are solvable and similar arguments to the above can be applied (see the GitHub repository for Mathematica notebooks). Controls B, C, E, F all satisfy Eq. (S63); this can be shown numerically for controls A and X. However, controls G, Y and Z satisfy Eq. (S70). Notably, Eq. (S70) does not depend on $f_s$, unlike Eq. (S63) (see GitHub repository for Mathematica notebooks). This is because control of copy number occurs in the degradation rate, rather than the replication rate, for controls G, Y and Z. A modified version of a Moran process (presented below) can provide intuition for why the diffusion rate of heteroplasmy variance depends on the network state when the population is controlled through replication, and does not depend on network state when the population is controlled through degradation.

Choice of nominal parametrization
In this section we discuss our choice of nominal parametrization for the network system in Eq. (1)-Eq. (9), given the replication rate in Eq. (10). We will first discuss our choice of network parameters. Cagalinec et al. (2013) found that the average fission rate in cortical neurons is 0.023±0.003 fissions/mitochondria/min. Assuming that this value is representative of the fission rate in general, and converting this to units of per day, we may write the mitochondrial fission rate as β = 33.12 day −1 .
If the propensity of Eq. (3) is $\hat{a}_{\mathrm{fis},w} = \beta w_f$ then the mean time to the next event is $1/(\beta w_f)$; therefore the dimension of $\beta$ is per unit time and copy numbers are pure numbers, i.e. dimensionless. Similar reasoning constrains the dimension of the fusion rate, see below.
Evaluation of the fusion rate is more involved, since fusion involves two different chemical species coming together to react whereas fission may be considered as spontaneous. Furthermore, there are 7 different fusion reactions whereas there are only 2 fission reactions. For simplicity, assume that all species have a steady-state copy number of $x_i = 250$ (resulting in a total copy number of 1000, heteroplasmy of 0.5 and 50% of mitochondria existing in the fused state). Neglecting subtleties relating to bimolecular reactions involving one species (see Eq. (S20)), each fusion reaction proceeds at rate $\hat{a}_{\mathrm{fus},j} \approx \gamma x_i^2$. Since there are 7 fusion reactions (Eq. (1), Eq. (2), Eq. (7)-Eq. (9)), the total fusion propensity is $\hat{a}_{\mathrm{fus}} \approx 7 \gamma x_i^2$. Similarly, the total fission propensity is $\hat{a}_{\mathrm{fis}} = \beta (w_f + m_f) = 2 \beta x_i$. Since we expect macroscopic proportions of both fused and fissioned species in many physiological settings, we may equate the fusion and fission propensities, $\hat{a}_{\mathrm{fus}} = \hat{a}_{\mathrm{fis}}$, and rearrange for the fusion rate $\gamma$ to yield $\gamma = 2 \beta x_i / (7 x_i^2) \approx 3.8 \times 10^{-2}$ day$^{-1}$. The orders-of-magnitude difference between $\beta$ and $\gamma$ stems from the observation that the fusion propensity depends on the square of copy number whereas the fission propensity depends on copy number linearly.
Given the network parameters, we then explored appropriate parametrizations for the genetic parameters: the mitophagy rate (µ) and the parameters of the linear feedback control (κ, b and δ, see Eq. (10)). mtDNA half-life is observed to be highly variable: in mice this can be between 10-100 days (Burgstaller et al., 2014a). For consistency with another recent study investigating the relationship between network dynamics and heteroplasmy, we use an mtDNA half-life of 30 days (Tam et al., 2015).
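The arithmetic behind these nominal rates is short enough to reproduce directly. The sketch below recomputes the fission rate, the matched fusion rate and the degradation rate from the quantities quoted in the text; the function name `nominal_rates` is our own, and converting a half-life to a rate via $\ln 2 / t_{1/2}$ assumes simple exponential decay.

```python
from math import log

MINUTES_PER_DAY = 60 * 24


def nominal_rates(fission_per_min=0.023, copies_per_species=250,
                  n_fusion_reactions=7, half_life_days=30.0):
    """Recompute nominal rates (per day) from the values quoted in the text."""
    beta = fission_per_min * MINUTES_PER_DAY  # fission rate, 33.12 per day
    # Balance total fusion propensity 7*gamma*x^2 against fission 2*beta*x.
    gamma = 2 * beta * copies_per_species / (
        n_fusion_reactions * copies_per_species**2)
    # Assumes exponential decay: rate = ln(2) / half-life.
    mu = log(2) / half_life_days
    return {"beta": beta, "gamma": gamma, "mu": mu}


rates = nominal_rates()
for name, value in rates.items():
    print(f"{name} = {value:.4g} per day")
```

Because $\gamma$ scales as $1/x_i$ under this balance argument, a different assumed steady-state copy number would change the matched fusion rate in inverse proportion.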
The parameter $\delta$ in the replication feedback control (see Eq. (10)) may be interpreted as the "strength of sensing of mutant mtDNA" in the feedback control (Hoitzing et al., 2017). Assuming that fluctuations in copy number of mutants and wild-type molecules are sensed identically (as may be the case for e.g. non-coding mtDNA mutations) we may reasonably assume a model of $\delta = 1$ as the simplest case of a neutral mutation (although $\delta \neq 1$ still defines a neutral model, since both mutant and wild-type alleles experience the same replication and degradation rates per molecule, see Eqs. (4)-(6)).
We are finally left with setting the parameters κ and b in the linear feedback control Eq. (10). In the absence of a network state and mutants, κ is precisely equal to the steady state copy number, since the degradation rate equals the replication rate when κ = w. However, the presence of a network means that a subpopulation of mtDNAs (namely the fused species) are immune to death, resulting in κ no longer being equivalent to the steady state copy number. The parameter b may be interpreted as the feedback control strength, which determines the extent to which the replication rate changes given a unit change in copy number.
Given a particular value of $b$, we may search for a $\kappa$ which gives a total steady state copy number ($n$) which is closest to some target value (e.g. 1000 as a typical total mtDNA copy number per cell in human fibroblasts (Kukat et al., 2011)). We swept a range of different values of $b$ and found that, for values of $b$ smaller than a critical value ($b^*$), a $\kappa$ could not be found whose deterministic steady state was sufficiently close to $n = 1000$. This result is intuitive because in the limit of $b \to 0$, $\lambda = \mathrm{const}$. From the analysis above we have shown that constant genetic rates ($\mu$, $\lambda$) result in unstable copy numbers, and therefore a sufficiently small value of $b$ is not expected to yield a stable non-trivial steady state solution. We chose $b \approx b^*$, and the corresponding $\kappa$, such that the steady state copy number is controlled as weakly as possible given the model structure.

Rate renormalization
In Eqs. (1)-(10) we have neglected reactions such as because they do not change the number of molecules in our state vector x = (w s , w f , m s , m f ). One may ask whether neglecting such reactions means that it is necessary to renormalize the fission-fusion rates which were estimated in the preceding section. In estimating the nominal parametrization above, we began by using a literature value for the mitochondrial fission rate, and then matched the fusion rate such that the summed hazard of a fusion event approximately balanced the fission rate. This matching procedure is reasonable, since we observe a mixture of fused and fissioned mitochondria under physiological conditions: choosing a fusion rate which is vastly different results in either a hyperfused or fragmented network. We must therefore only justify the fission rate. Eq. (3) assumes that a fission reaction always results in a singleton, and a singleton is by definition a molecule which is susceptible to mitophagy (see Eq. (6)). Therefore, if fission reactions always result in mitochondria containing single mtDNAs which are susceptible to mitophagy, then we expect our model to match well to true physiological rates. If, on the other hand, fission reactions between large components of the network which are too large to be degraded are common, then renormalization of β by the fraction of fission events which result in a sufficiently small mitochondrion would be necessary, which would in turn renormalize γ through our matching procedure. We are not aware of experimental measurements of the fraction of fission events which result in mitochondria which contain a particular number of mtDNAs. Such an experiment, combined with the distribution of mitochondrial sizes which are susceptible to mitophagy, would allow us to validate our approach. 
Despite this, the robustness of our results over approximately 4 orders of magnitude for the fission-fusion rate (Figure S4A-I) provides some indication that our results are likely to hold in physiological regions of parameter space.
A modified Moran process may account for the alternative forms of heteroplasmy variance dynamics under different models of genetic mtDNA control
We sought to gain insight into why control of population size through the replication rate ($\lambda = \lambda(x)$, $\mu = \mathrm{const}$) results in heteroplasmy variance depending on the fraction of unfused mitochondria (see Eq. (13)), whereas control of population size through the degradation rate ($\mu = \mu(x)$, $\lambda = \mathrm{const}$) results in heteroplasmy variance becoming independent of network state (Eq. (S72)). We will proceed by considering an analogous Moran process to the set of reactions presented in Eqs. (1)-(9).
First, consider a haploid biallelic Moran process consisting of wild-types and mutants, in a population of fixed size $n$. At each step in discrete time, a member of the population is chosen for birth, and another for death. Let $m_t$ denote the number of copies of mutants at time $t$; the process is then defined by Eq. (S73). Defining $h_t := m_t/n$, the statistics of $h_{t+1}$ given $h_t$ follow from Eq. (S73), leading to Eq. (S76). Suppose that, instead of the process occurring in discrete time, the process occurs in continuous time, where each event is a simultaneous birth and death, and is modelled as a Poisson process. Suppose that events occur at a rate $\mu$ per capita. The waiting time between successive events ($\tau$) is an exponential random variable with rate $\mu n$. Hence the expected waiting time between successive events is $E(\tau) = 1/(\mu n)$ (Eq. (S77)). Taking the ratio of Eq. (S76) and Eq. (S77) yields Eq. (S78). Heuristically, one could interpret Eq. (S78) as a ratio of differentials as follows. If we were to suppose that $n$ were large enough such that $E(\tau)$ is very small, and $h_t$ is approximately constant ($h_0$) after a small number of events, then we obtain Eq. (S79), where we have replaced the inter-event time $\tau$ with physical time $t$. This result is analogous to Eq. (S72) and Eq. (12) of Johnston and Jones (2016), and agrees with simulation (Figure S6A).
Now consider the modified Moran process in Figure S6B, which we refer to as a "protected" Moran process. Let $0 < f_s \leq 1$ be the fraction of individuals susceptible to death, which is a constant. $n f_s h_t$ mutants and $n f_s (1 - h_t)$ wild-types are randomly chosen to be susceptible to death, where $n$ is large. In this continuous-time model, the inter-event time is $\tau \sim \mathrm{Exponential}(\Gamma)$ where $\Gamma$ will be defined below. Then an individual from the susceptible population is chosen for death, and any individual is allowed to be born. The birth and death events occur simultaneously in time.
Again, using $t$ as an integer counter of events, we obtain transition probabilities which are equivalent to the definition of a Moran process in Eq. (S73), meaning that Eq. (S76) applies to the protected Moran process as well. We consider two heuristic arguments for choosing the inter-event rate $\Gamma$, where the inter-event time $\tau \sim \mathrm{Exponential}(\Gamma)$. Firstly, if the death rate per capita is constant ($\mu$), then the rate at which a death event occurs in the system ($\Gamma_{\mathrm{death}}$) is proportional to the number of individuals which are susceptible to death: $\Gamma_{\mathrm{death}} = \mu n f_s$. If we assume that the overall birth rate is matched to the overall death rate so that population size is maintained, as is the case when $\lambda = \lambda(x)$ in the network system, then the overall birth rate ($\Gamma_{\mathrm{birth}}$) must also be $\Gamma_{\mathrm{birth}} = \mu n f_s$. Hence $\Gamma$ is proportional to $\mu n f_s$, where $\mu$ is a proportionality constant. Since, for a Moran event to occur, both a birth and a death event must occur, time effectively runs twice as fast in a Moran model relative to a comparable chemical reaction network model. We therefore rescale time by taking $\mu \to \mu/2$, and thus $\Gamma = \mu n f_s$.
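The protected Moran process with $\Gamma = \mu n f_s$ is simple enough to simulate directly. The sketch below is our own minimal implementation, not the authors' simulation code. It compares the sample variance of heteroplasmy with the linear estimate $2 \mu f_s t\, h_0 (1 - h_0)/n$, which follows from the standard per-event Moran variance $2h(1-h)/n^2$ multiplied by the expected number of events $\Gamma t$. Whether this estimate matches the displayed equations term by term is an assumption, and all parameter values are illustrative.

```python
import random
from statistics import pvariance


def protected_moran_endpoint(n, h0, f_s, mu, t_end, rng):
    """Heteroplasmy at t_end for one realisation of the protected Moran process."""
    m = round(h0 * n)              # number of mutants
    rate = mu * n * f_s            # inter-event rate Gamma
    t = rng.expovariate(rate)
    while t < t_end:
        h = m / n
        # Death is drawn from the susceptible pool, which mirrors h by
        # construction; birth is drawn from the whole population.
        death_is_mutant = rng.random() < h
        birth_is_mutant = rng.random() < h
        m += int(birth_is_mutant) - int(death_is_mutant)
        t += rng.expovariate(rate)
    return m / n


def heteroplasmy_variance(f_s, n=100, h0=0.5, mu=1.0, t_end=0.4,
                          n_paths=2000, seed=1):
    """Sample variance of heteroplasmy across independent realisations."""
    rng = random.Random(seed)
    return pvariance([protected_moran_endpoint(n, h0, f_s, mu, t_end, rng)
                      for _ in range(n_paths)])


for f_s in (1.0, 0.5):
    estimate = 2 * 1.0 * f_s * 0.4 * 0.5 * (1 - 0.5) / 100
    print(f"f_s={f_s}: simulated {heteroplasmy_variance(f_s):.5f}, "
          f"linear estimate {estimate:.5f}")
```

Halving `f_s` halves the number of events per unit time while leaving the per-event statistics unchanged, so the simulated variance at a fixed time roughly halves.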
As a result, $E(\tau) = 1/(\mu n f_s)$ and therefore, using Eq. (S76) and the reasoning in Eq. (S79), we obtain a heteroplasmy variance which increases at a rate proportional to $f_s$. This is analogous to when $\lambda = \lambda(x)$ and $\mu = \mathrm{const}$ in the network system. Hence, when $\lambda = \lambda(x)$ and $\mu = \mathrm{const}$, the absence of death in the fused subpopulation means the timescale of the system (being the time to the next death event) is proportional to $f_s$. This argument is only a heuristic, since the Moran process is defined such that birth and death events occur simultaneously and therefore do not possess separate propensities ($\Gamma_{\mathrm{birth}}$ and $\Gamma_{\mathrm{death}}$). The second case we consider is when each individual has a constant rate of birth, hence $\Gamma_{\mathrm{birth}} \propto n$. Then the death rate is chosen such that $\Gamma_{\mathrm{birth}} = \Gamma_{\mathrm{death}}$. In this case $\Gamma = \lambda n$, where $\lambda$ is a proportionality constant. The same argument from Eq. (S77) to Eq. (S79) may be applied, with an appropriate rescaling of time, and we arrive at Eq. (S72). This is analogous to when $\mu = \mu(x)$ and $\lambda = \mathrm{const}$ in the network system. Hence, when $\mu = \mu(x)$ and $\lambda = \mathrm{const}$, the presence of a constant birth rate in the entire population means the timescale of the system (being the time to the next birth event) is independent of $f_s$.
Table S1. Key predictions from our mathematical models.
The following results hold for our neutral genetic model of a post-mitotic cell, with a simple model of mitochondrial network dynamics:
1. The rate of increase of heteroplasmy variance is proportional to the fraction of unfused mitochondria, but independent of the absolute magnitude of fission-fusion rates, due to a rescaling of time by the mitochondrial network (Eq. (13), Eq. (S63), Figure 2E-H).
2. The rate of accumulation of de novo mutations increases as the fraction of unfused mitochondria increases, due to a rescaling of time by the mitochondrial network (Figure 3B-D).
3. When fusion is selective, intermediate fusion-fission ratios are optimal for reducing mean heteroplasmy (Figure 4A).
4. When mitophagy is selective, complete fission is optimal for reducing mean heteroplasmy (Figure 4B).
Figure S1. Phase portrait for an ODE representation of a network system with constant rates. The system displays two steady states: a trivial steady state at $s + f = 0$, and a non-trivial steady state. At a steady state, both time derivatives in Eq. (S8) and Eq. (S9) vanish. Trajectories (blue) show the evolution of the system up to $t = 1000$. The direction and magnitude of the derivative at points in space are shown by red arrows. Trajectories can be seen either decaying to $s = f = 0$ or tending to infinity. $\gamma = 0.01656$, $\beta = 33.12$, $\rho = 0.023$, $\eta = 1.1$, $\alpha = 1.04$.
Table S2 (fragment).
b: Chosen as the weakest control strength which has a non-trivial steady state and total copy number of 1000.
κ: Steady state copy number parameter; value 11.7; dimensionless; see remark for b.
Figure S2. Deterministic treatment of network system. (A) Deterministic dynamics of total copy number under linear feedback control, which is controlled to a particular steady state value (see Figure 2A&B). (B) Defining a knock-down (KD) factor ($k^{-1} = 0.1, 0.2, \dots, 1.0$), the fission rate was rescaled to $\beta \to \beta/k$ (red) and the fusion rate to $\gamma \to \gamma/k$ (blue), causing a linear increase and decrease in total copy number respectively under a deterministic treatment (see Figure 2A&B).
Figure S3. Stochastic treatment of network system. (A) Copy number variance for stochastic simulations initially increases, since all stochastic simulations begin with the same initial condition, but then plateaus since the steady state line is globally attracting (see Figure 2C&D). (B) Error in Eq. (13) in a sweep over the feedback control strength, $b$. Dotted line denotes a 5% error according to Eq. (12). (C) $\mathbb{V}(h)$ profile for the parametrization with the largest error in (B). (D) Sweeps of the network rate magnitude (see Figure 2H). Heteroplasmy variance is approximately independent of absolute network rates over a broad range of network magnitudes. (E) Allowing fused species to be degraded with relative rate $\xi$ (Eq. (S68)), stochastic simulations for heteroplasmy variance (markers) and Eq. (S69) (lines) are shown. Fused species degradation induces a linear correction to the heteroplasmy variance formula.
Table S3. Equations are accurate to at least 5% (blue regions) across large regions of parameter space, for many control laws. Fusion and fission rates are redefined as $\gamma \to \gamma_0 M R$ and $\beta \to \beta_0 M$ where $M$ and $R$ denote the magnitude and ratio of the network rates, and $\gamma_0$, $\beta_0$ denote the nominal parametrizations of the fusion and fission rates respectively. Summary statistics for $10^4$ realizations, with initial condition $h = 0.3$ and evaluated at $t = 500$ days. Errors in $\mathbb{V}(h)$ (see Eq. (12)) smaller than 5% are truncated and are shown as blue. Parametrizations where a deterministic steady state could not be found for an initial condition of $h = 0.3$ are shown in grey. Inset figures, where present, display the probability of fixation at $h = 0$. Where insets are not present, the probability of fixation is negligible. (A-F) When $\lambda = \lambda(x)$ and $P(h = 0)$ is low, Eq. (13) performs well in the high-churn limit. (G-I) When $\mu = \mu(x)$ and $P(h = 0)$ is low, Eq. (S72) performs well in the high-churn limit.
Fig. S4. In all cases, the nominal fission and fusion rates were $\beta = 33.12$, $\gamma = 0.038$ respectively.

Control Y in GitHub repository
Control Z: Differential control for target population in degradation (see Figure S4I); $\lambda(x) = \lambda$, with $\lambda = 0.023$; $\mu(x) = \alpha(w_T - w_{\mathrm{opt}})$, with $\alpha = 1$, $w_{\mathrm{opt}} = 1000$; Control Z in GitHub repository
Figure S5. Ansatz predicts heteroplasmy variance for linear feedback control in a fast mtDNA turnover regime when fixation probability is low. Stochastic simulations of the linear feedback control network system with an mtDNA half-life of 2 days (Poovathingal et al., 2012), corresponding to µ = ln (2) Table S2.

Figure S6. Exploration of analogous Moran processes. (A) The original biallelic Moran process satisfies Eq. (S72), where $h_0$ is the initial heteroplasmy, which is equivalent to $E(h)$. (B) The "protected" Moran process. The population size is constrained to be fixed to some large constant, $n$. There exist two alleles, mutants (black circles) and wild-types (white circles). $h n f_s$ mutant and $(1 - h) n f_s$ wild-type molecules are chosen to be susceptible to death; the rest are protected from death (denoted by a bar). An exponential random variable is drawn as the waiting time to the next event (see the section "A modified Moran process may account for the alternative forms of heteroplasmy variance dynamics under different models of genetic mtDNA control" for discussion of the form of the rate $\Gamma$). Time is incremented by the waiting time, then a death event occurs from the susceptible population and a birth event from any individual simultaneously. The same individual is allowed to be chosen for both birth and death. The process is then repeated iteratively.
Figure S7. A deterministic parameter sweep of fusion selectivity and the relative fusion rate for mitochondrial quality control. An ODE treatment allows smaller heteroplasmy changes to be probed without the need to resort to an infeasible number of stochastic simulations. Displaying the relative change in heteroplasmy ($\Delta h$) after $t = 1000$ days. We observe that a reduction in heteroplasmy is achieved at intermediate fusion rates at all non-zero fusion selectivities investigated. Grey denotes a change which is smaller than floating point precision.
Table S4. Comparison of previous models with our model of mitochondrial genetic/network dynamics. Each of the key differences with our model is enumerated, and has a corresponding comment; see Discussion.