Digital Twins: Where Data, Mathematics, Models, and Decisions Collide
Regardless of whether one is entering the interdisciplinary field of computational science and engineering with a background in mathematics, statistics, computer science, engineering, or the sciences—or maybe a little bit of each—it is such an exciting time to be a computational scientist. The field is in the midst of a tremendous convergence of technologies that generate unprecedented system data and enable automation, algorithms that let users process massive amounts of data and run predictive simulations that drive key decisions, and the computing power that makes these algorithms feasible at scale for complex systems and in real-time or in situ settings.
The convergence of these three elements is a revolution that touches almost every area of science, engineering, and society. It affects all aspects of a complex system’s lifecycle by advancing simulation and modeling capabilities in each phase of design; impacting the way we plan, monitor, and optimize manufacturing processes; and changing the way we operate systems.
Digital Twins Integrate Data, Models, and Decisions
A digital twin is defined as “a set of virtual information constructs that mimics the structure, context, and behavior of an individual/unique physical asset, is dynamically updated with data from its physical twin throughout its lifecycle, and informs decisions that realize value” [1]. People often wonder how a digital twin differs from the modeling and simulation that researchers have been conducting for decades. The quoted definition highlights three key points of differentiation. First, a digital twin is personalized because it targets an individual or unique asset instead of being a generic computational model. This means that it must reflect asset-to-asset differences and variability. Second, a digital twin is a living model—not a static computational model—that evolves as the physical twin evolves. And third, a digital twin encapsulates an integrated end-to-end view of data, models, and decisions.
The community is beginning to see digital twins deployed for myriad engineering applications, including aircraft, spacecraft, buildings, bridges, engines, automobiles, wind turbines, and floating production storage and offloading units. As Figure 1 illustrates, there is also gathering momentum to develop digital twins in medicine and the geosciences. However, state-of-the-art digital twins are largely the result of custom implementations that require considerable deployment resources and a high level of expertise. Moving from the one-off digital twin to accessible, robust digital twin implementations at scale requires rigorous and scalable mathematical foundations [4].
A Mathematical Foundation for Digital Twins
One example of a mathematical foundation for digital twins is a probabilistic graphical model [3]. Probabilistic graphical models provide a powerful mathematical abstraction for modeling complex systems in an insightful and intuitive way while also serving as a foundation for generalizable and scalable computational methods. For example, graphical models have seen wide application as a unified framework for modeling and inference in the field of robotics, where they enable tasks like perception, state estimation, and motion planning for a variety of robotic systems [2]. In the context of digital twins, a probabilistic graphical model enables data-driven asset monitoring; digital twin model updating; and model-based prediction, planning, and decision-making to all be formulated as probabilistic inference tasks. Furthermore, one can exploit the graphical model structure to develop principled and scalable algorithms that carry out these inference tasks.
To develop a probabilistic graphical model for digital twins, we first formulate a mathematical abstraction of the system, which comprises six key elements that are defined in Figure 2. The partially observable physical asset state \(S_t\) is reflected in the digital twin state \(D_t\), which is informed by observational data \(O_t\). One can use the computational models that comprise the digital twin to predict quantities of interest \(Q_t\), which in turn inform control inputs \(U_t\) that influence the asset state. Definition of a reward \(R_t\) facilitates the analysis and optimization of the system. Here, the subscript \(t\) denotes a discrete time step.
A graphical model serves as a formal mathematical representation of how these elements interact with one another and how they evolve over time. Such models can take many specific forms, including Bayesian networks, factor graphs, or deep generative networks. Figure 3 illustrates a dynamic decision network (specifically, a dynamic Bayesian network with the addition of decision nodes) for a digital twin system. Each quantity is modeled at a particular instant in time as a random variable, which is represented as a node in the graph. Edges signify relationships between quantities; these are typically encoded as conditional probability distributions. For example, the conditional probability
\[p(D_t | D_{t-1} = d_{t-1}, O_t=o_t)\]
models the transition in the digital state from timestep \(t-1\) to timestep \(t\), conditioned on observed data \(o_t\). Physics-based models that constrain the system dynamics according to physical governing equations may be embedded within the definition of this probability distribution. The distribution could also be data-driven, e.g., fitted to historical asset data.
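As a minimal sketch of the data-driven option, one could estimate a tabular version of this conditional distribution by counting transitions in historical records. Everything here is an illustrative assumption rather than the authors' method: the digital state is taken to be discrete and labeled in the historical data, the function name `fit_update_table`, the two-state "healthy/damaged" asset, and the "low/high" strain observations are all hypothetical, and additive smoothing is used only to avoid zero probabilities.

```python
from collections import Counter, defaultdict


def fit_update_table(records, num_states, alpha=1.0):
    """Estimate p(D_t = d | D_{t-1} = d_prev, O_t = o) from historical records.

    records: iterable of (d_prev, obs, d_next) triples (hypothetical data format).
    alpha:   additive smoothing pseudo-count applied to every next state.
    Returns a dict mapping (d_prev, obs) to a list of probabilities over d_next.
    """
    counts = defaultdict(Counter)
    for d_prev, obs, d_next in records:
        counts[(d_prev, obs)][d_next] += 1

    table = {}
    for key, counter in counts.items():
        total = sum(counter.values()) + alpha * num_states
        table[key] = [(counter[d] + alpha) / total for d in range(num_states)]
    return table


# Hypothetical history: state 0 = healthy, 1 = damaged; observations are strain levels.
records = [
    (0, "low", 0), (0, "low", 0), (0, "low", 0),
    (0, "high", 1), (0, "high", 0), (0, "high", 1),
    (1, "high", 1), (1, "low", 1),
]
table = fit_update_table(records, num_states=2)

# Each entry is a distribution over the next digital state.
for (d_prev, obs), probs in sorted(table.items()):
    print(f"p(D_t | D_t-1={d_prev}, O_t={obs}) = {probs}")
```

A physics-based alternative would replace the counting step with a model that maps the previous state and the observation to a distribution over the next state; the table above is only the simplest data-driven stand-in.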
The graph’s sparse connectivity encodes a set of known or assumed conditional independencies. One can exploit this conditional independence structure to factorize joint distributions over variables of interest in the model. For example, we can factorize our belief about the digital state \(D_t\), quantities of interest \(Q_t\), and rewards \(R_t\), conditioned on observed variables for all timesteps until the current time (namely the data \(O_t=o_t\) and enacted control inputs \(U_t=u_t\) for \(t \in \{0,...,t_c\}\)), according to the structure of the proposed graphical model as
\[p(D_0,...,D_{t_c},Q_0,...,Q_{t_c},R_0,...,R_{t_c} | o_0,...,o_{t_c},u_0,..., u_{t_c}) = \prod_{t=0}^{t_c} \left[\phi^{\rm{update}}_{t}\phi^{\rm{QoI}}_t\phi^{\rm{evaluation}}_t\right],\tag1\]
where
\[\phi^{\rm{update}}_t=p(D_t | D_{t-1}, U_{t-1}=u_{t-1}, O_t=o_t)\tag2\]
\[\phi^{\rm{QoI}}_t=p(Q_t | D_t)\tag3\]
\[\phi^{\rm{evaluation}}_t=p(R_t | D_t, Q_t, U_t = u_t, O_t=o_t).\tag4\]
One can formulate predictions in a similar manner by extending this belief state to include digital state, quantity of interest, and reward variables up until the chosen prediction horizon \(t_p\).
Factored representations such as \((1)\) serve two purposes. First, each factor—denoted as \(\phi\) and defined in \((2)\) through \((4)\)—is a conditional probability distribution that exposes particular processes or interactions that must be characterized in a digital twin (e.g., asset state dynamics, data generation and observation, computational estimation of quantities of interest, and so forth). Second, the factorization serves as a foundation for deriving efficient sequential Bayesian inference algorithms that leverage the digital twin models to enable high-level digital twin functionality like asset monitoring, prediction, and optimization [3].
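To make the idea of sequential inference concrete, the following is a minimal sketch of a discrete Bayesian filter that applies an update factor one timestep at a time and then propagates the belief forward to a prediction horizon. It is not the algorithm of [3]. It assumes a hypothetical two-state asset, builds the update factor from a control-dependent transition table and an observation likelihood (one possible modeling choice), treats the quantity of interest as a deterministic function of the digital state, and tracks only the marginal belief over the current digital state rather than the full joint distribution. All names and numbers are invented for illustration.

```python
def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("observation has zero probability under the model")
    return [w / total for w in weights]


def predict(belief, transition):
    """Propagate the belief one step using transition[i][j] = p(D_t=j | D_{t-1}=i, u)."""
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]


def update(belief, transition, likelihood, obs):
    """Predict, then reweight by likelihood[j][obs] = p(O_t=obs | D_t=j)."""
    predicted = predict(belief, transition)
    return normalize([p * likelihood[j][obs] for j, p in enumerate(predicted)])


def expected_qoi(belief, qoi_by_state):
    """Expected quantity of interest when Q_t is a deterministic function of D_t."""
    return sum(b * q for b, q in zip(belief, qoi_by_state))


# Hypothetical model: state 0 = healthy, 1 = damaged (damage is irreversible).
TRANSITIONS = {
    "aggressive": [[0.80, 0.20], [0.0, 1.0]],
    "gentle": [[0.95, 0.05], [0.0, 1.0]],
}
LIKELIHOOD = [[0.8, 0.2], [0.3, 0.7]]  # observation 0 = low strain, 1 = high strain
QOI_BY_STATE = [1.0, 0.6]              # e.g., fraction of nominal load capacity

belief = [1.0, 0.0]                                # belief at t = 0
controls = ["aggressive", "aggressive", "gentle"]  # u_0, u_1, u_2
observations = [1, 1, 0]                           # o_1, o_2, o_3

# Monitoring: assimilate each observation in turn.
history = [belief]
for u_prev, o_t in zip(controls, observations):
    belief = update(belief, TRANSITIONS[u_prev], LIKELIHOOD, o_t)
    history.append(belief)

# Prediction: no new data, so only the transition model is applied.
forecast = belief
for _ in range(2):
    forecast = predict(forecast, TRANSITIONS["gentle"])

print("current belief:", belief)
print("expected QoI now:", expected_qoi(belief, QOI_BY_STATE))
print("expected QoI at horizon:", expected_qoi(forecast, QOI_BY_STATE))
```

Because each step uses only the previous belief, the latest control, and the latest observation, the cost per timestep does not grow with the length of the history, which is the practical benefit of working from a factorized model.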
Outlook
The early successes of digital twin deployment point to the idea’s value and potential impact. Now is the time for the applied mathematics and computational science communities to develop the rigorous mathematical underpinnings and scalable algorithms that will take digital twins to the next level. As noted in [4], for inspiration we can look to the evolution of the finite element method from an expert-driven approach that required specialization for each different application to a broadly applicable analysis and design tool that is now in the hands of every engineer. This evolution has been enabled by foundational mathematical theory, computing scalability achieved through a combination of hardware and algorithmic advances, and flexible software implementations.
In order to advance digital twins to a similar level of maturity and accessibility, our community’s work in a variety of areas—including physics-based modeling, inverse problems, data assimilation, uncertainty quantification, optimal control, optimal experimental design, surrogate modeling, scientific machine learning, scalable algorithms, and scientific software—has an important role to play.
This article is based on Karen Willcox’s invited talk at the 2021 SIAM Conference on Computational Science and Engineering (CSE21), which took place virtually in March.
References
[1] AIAA Digital Engineering Integration Committee. (2020). Digital twin: Definition & value. (AIAA and AIA Position Paper). American Institute of Aeronautics and Astronautics and Aerospace Industries Association.
[2] Dellaert, F. (2021). Factor graphs: Exploiting structure in robotics. Annu. Rev. Control Robot. Auton. Syst., 4(1), 141-166.
[3] Kapteyn, M., Pretorius, J., & Willcox, K. (2021). A probabilistic graphical model foundation for enabling predictive digital twins at scale. Nat. Comput. Sci., 1(5), 337-347.
[4] Niederer, S., Sacks, M., Girolami, M., & Willcox, K. (2021). Scaling digital twins from the artisanal to the industrial. Nat. Comput. Sci., 1(5), 313-320.
About the Authors
Michael G. Kapteyn
Postdoctoral Fellow, University of Texas, Austin
Michael G. Kapteyn is a postdoctoral fellow in the Oden Institute for Computational Engineering and Sciences at the University of Texas at Austin. He recently received his Ph.D. in computational science and engineering from the Massachusetts Institute of Technology, where he worked in the Aerospace Computational Design Laboratory developing mathematical and computational foundations for digital twins.
Karen E. Willcox
Director, Oden Institute for Computational Engineering and Sciences
Karen E. Willcox is director of the Oden Institute for Computational Engineering and Sciences, Associate Vice President for Research, and a professor of aerospace engineering and engineering mechanics at the University of Texas at Austin. She is also an external professor at the Santa Fe Institute and a SIAM Fellow.