Identifiability and Sensitivity Analysis for Bayesian Parameter Estimation in Systems Biology
An ongoing challenge in systems biology—the field of computational biology that employs mathematical modeling to study signal transduction in cells—is the estimation of model parameters to constrain a model against experimental data [9]. Despite recent advances in experimental technologies, direct measurement of these parameters remains difficult. Researchers throughout the computational sciences have developed many parameter estimation methods, including nonlinear optimization and Bayesian inference [6, 7], but these methods are not yet widespread in systems biology.
![<strong>Figure 1.</strong> A comprehensive framework for parameter estimation and uncertainty quantification in systems biology. Figure adapted from [4].](/media/xbwh2yoq/figure1.jpg)
Why is parameter estimation so challenging in systems biology? Models are based on rate laws that describe biochemical reaction kinetics; in many cases, unknown system compositions or reaction mechanisms lead to uncertainty in a model formulation. Furthermore, researchers often approximate complex biochemical reactions with simplified rate laws—such as Michaelis-Menten kinetics and Hill equations—which contain rational nonlinearities and introduce multiple unknown parameters in each state equation. The strong nonlinearity in the states and parameters, as well as the large parameter count relative to the number of states, collectively result in parameter spaces that are difficult to explore with estimation algorithms. In conjunction with these modeling challenges, biological experiments are time-consuming, costly, and limited by available technologies. Experimental data therefore often include only a few time points (sparsity in time), measure a subset of the states (incompleteness), and are corrupted by various noise sources. Many different parameter sets will yield model predictions that fit this uncertain data equally well.
We recently developed a comprehensive framework for parameter estimation and uncertainty quantification of systems biology models (see Figure 1) [4]. We assumed that the models consist of ordinary differential equations
where
The first question is thus as follows: Which parameters can be uniquely estimated from the available data? Nonlinear models and incomplete data often complicate parameter estimation because parameters can be nonidentifiable — infinitely many values of a nonidentifiable parameter lead to the same model outputs [10]. Nonidentifiable parameters may then result in flat loss functions and probability distributions that are hard to optimize or sample with estimation algorithms. In addition, parameters that weakly influence model outputs are difficult to estimate because perturbations to such parameters cause small changes in the value of a loss function or probability distribution. To exclude nonidentifiable and noninfluential parameters from estimation, we applied structural global identifiability analysis [3] and Sobol’s variance-based global sensitivity analysis [7] (see step 1 in Figure 1).
We chose to use Bayesian inference for parameter estimation rather than an optimization-based approach because Bayesian methods provide a complete accounting of uncertainties via probability distributions [6, 7]. Consequently, we aimed to learn the probability densities for the random variables that model the parameters
to find the posterior distribution
![<strong>Figure 2.</strong> Marginal distributions for the four influential and identifiable parameters of the mitogen-activated protein kinase (MAPK) model in the bistable regime. The gray distributions correspond to results from the estimation of only these parameters; the blue distributions correspond to results from the estimation of all structurally globally identifiable parameters; and the green distributions correspond to results from the estimation of all model parameters. Figure adapted from [4].](/media/ohrpdesn/figure2.jpg)
To test our framework, we applied it to estimate the parameters of a model of the mitogen-activated protein kinase (MAPK) signaling pathway [5]. This pathway comprises a series of biochemical reactions responsible for various biological processes and implicated in several diseases. The MAPK model represents the dynamics of three biochemical species in the MAPK pathway via the equations
It incorporates 14 model parameters:
How does failing to eliminate nonidentifiable parameters affect parameter estimation? To answer this question, we performed a set of estimation experiments with the MAPK model that focused on the bistable regime and the influential parameters:
Next, we estimated the parameters that yield oscillatory dynamics given the noisy data in Figure 3b, focusing on the identifiable and influential parameters:
![<strong>Figure 3.</strong> Parameter estimation results for the mitogen-activated protein kinase (MAPK) model when using the constrained interval unscented Kalman filter Markov chain Monte Carlo (CIUKF-MCMC) method on the reduced model parameters. <strong>3a.</strong> Marginal posterior distributions. <strong>3b.</strong> Ensemble simulation with 30,000 random posterior samples. Figure adapted from [4].](/media/krjnn2jz/figure3.jpg)
Why did we only provide data that were sampled from the limit cycle? The common thought in parameter estimation is that including more data reduces estimation uncertainty. But for the MAPK oscillations, we found that adding more data points that were sampled from the initial transient (first 30 minutes) led to greater uncertainty in parameter estimates. Specifically, we had an approximately 90 percent chance of predicting a limit cycle oscillation with the original 15 data points that were only sampled from the oscillations. However, the probability of predicting a limit cycle dropped to roughly 45 percent when we used 30 data points that were sampled from both the initial transient and oscillations. We hypothesize that data quality—in this case, focused sampling of the oscillations—is more important in this scenario than data quantity.
In summary, the proposed framework enables parameter estimation and uncertainty quantification for systems biology models in the face of nonlinearity, nonidentifiability, limited and noisy data, and uncertain model formulations. Although the omission of nonidentifiable and noninfluential parameters was crucial to successful estimation, we believe that future efforts should focus on estimation methods that can handle nonidentifiable parameters. Additionally, we are actively working to apply our framework to models with high-dimensional parameter spaces and data from in vitro live cell imaging experiments. Ultimately, this work highlights an opportunity for computational science and mathematics to transform systems biology.
Nathaniel Linden delivered a minisymposium presentation on this research at the 2022 SIAM Conference on Mathematics of Data Science (MDS22), which took place in San Diego, Ca., last year. He received funding to attend MDS22 through a SIAM Student Travel Award. To learn more about Student Travel Awards and submit an application, visit the online page.
References
[1] Galioto, N., & Gorodetsky, A.A. (2020). Bayesian system ID: Optimal management of parameter, model, and measurement uncertainty. Nonlinear Dyn., 102, 241-267.
[2] Goodman, J., & Weare, J. (2010). Ensemble samplers with affine invariance. Comm. Appl. Math. Comput. Sci., 5(1), 65-80.
[3] Hong, H., Ovchinnikov, A., Pogudin, G., & Yap, C. (2019). SIAN: Software for structural identifiability analysis of ODE models. Bioinformatics, 35(16), 2873-2874.
[4] Linden, N.J., Kramer, B., & Rangamani, P. (2022). Bayesian parameter estimation for dynamical models in systems biology. PLoS Comput. Biol., 18(10), e1010651.
[5] Nguyen, L.K., Degasperi, A., Cotter, P., & Kholodenko, B.N. (2015). DYVIPAC: An integrated analysis and visualisation framework to probe multi-dimensional biological networks. Sci. Rep., 5, 12569.
[6] Oden, J.T., Babuška, I., & Faghihi, D. (2017). Predictive computational science: Computer predictions in the presence of uncertainty. In Encyclopedia of computational mechanics (2nd ed.) (pp. 1-26). New York, NY: John Wiley & Sons, Ltd.
[7] Smith, R.C. (2013). Uncertainty quantification: Theory, implementation, and applications. In Computational Science and Engineering Series (Vol. 12). Philadelphia, PA: Society for Industrial and Applied Mathematics.
[8] Teixeira, B., Torres, L.A.B., Aguirre, L.A., & Bernstein, D.S. (2008). Unscented filtering for interval-constrained nonlinear systems. In Proceedings of the 47th IEEE Conference on Decision and Control (CDC 2008) (pp. 5116-5121). Cancún, México: Institute of Electrical and Electronics Engineers.
[9] Vittadello, S.T., & Stumpf, M.P.H. (2022). Open problems in mathematical biology. Math. Biosci., 354, 108926. [10] Wieland, F.-G., Hauber, A.L., Rosenblatt, M., Tönsing, C., & Timmer, J. (2021). On structural and practical identifiability. Curr. Opin. Syst. Biol., 25, 60-69.
About the Authors
Nathaniel Linden
Ph.D. Student, University of California, San Diego
Nathaniel Linden is a Ph.D. student in the Department of Mechanical and Aerospace Engineering at the University of California, San Diego. His research involves developing and applying computational methods for model calibration and uncertainty quantification in systems biology.

Padmini Rangamani
Professor, University of California, San Diego
Padmini Rangamani is a professor of mechanical and aerospace engineering at the University of California, San Diego. Her research uses mathematical and computational methods to study the complex interplay between cellular biology, mechanics, and biochemistry.

Boris Kramer
Assistant Professor, University of California, San Diego
Boris Kramer is an assistant professor of mechanical and aerospace engineering at the University of California, San Diego. He develops reduced-order models to enable the control, design, optimization, and uncertainty quantification of complex and large-scale systems.

Stay Up-to-Date with Email Alerts
Sign up for our monthly newsletter and emails about other topics of your choosing.