Volume 55 Issue 09 November 2022
Research

Learning for Dynamical Systems When Data Are Scarce

Learning-based techniques promise the ability to model unknown dynamical systems and thus facilitate predictions of values of interest, the synthesis of control strategies, and the verification of closed-loop safety. Many such algorithms assume the availability of training data that span all types of relevant behaviors. However, applying existing, purely data-driven techniques to dynamical systems with a physical embodiment can result in poor data efficiency, failure to generalize beyond the training domain, and even violation of the underlying laws of physics. Such deficiencies are particularly apparent when the training dataset is relatively small, as is the case in practically all applications that involve dynamical systems with a physical embodiment — especially those in dynamic and possibly contested environments.

Physics can and should inform learning for dynamical systems. We seldom need to learn merely from data because useful a priori system knowledge is often available, even when the exact dynamics are unknown. Such knowledge may stem from a variety of sources, such as basic physical principles, geometric constraints that arise from the system’s design, structural properties of the vector field (e.g., decoupling between the states), monotonicity of or bounds on the vector field, algebraic constraints on the states, and/or empirically validated invariant sets in the state space. This knowledge may present itself in various representations, including parametric and functional forms.

The effective inclusion of a priori knowledge when learning for dynamical systems can ensure that the learned model respects physical principles; it also significantly improves data efficiency and model generalization to previously unseen regions of the state space. Data and knowledge can collectively enable the rapid deployment of learning-based control strategies for systems with physical embodiments, in addition to reliable operation in the face of unexpected changes.

Here we exemplify the effectiveness of the joint use of data and physics-based knowledge in two settings with severe data scarcity. The first involves learning after an abrupt change in a system’s dynamics during operation, when learning is limited to data that essentially come from a single trajectory (see Figure 1). The second entails training deep neural networks for dynamical systems modeling to match or exceed the accuracy of conventional, data-driven deep learning methods while relying on orders of magnitude fewer trajectories. Although the suitable learning artifacts and expectations of learning differ between these two settings, the types, representations, and means of incorporating physics-based knowledge into learning rest on similar principles.

Figure 1. The incorporation of physics-based side information into learning on the fly yields high performance in situations that would otherwise be challenging—if not impossible—for merely data-driven learning approaches. 1a. Recovery of an F-16 aircraft with a priori unknown dynamics from a low-altitude, nose-down configuration. 1b. A comparison between state-of-the-art reinforcement learning algorithms and pre-training via extensive data. Figure courtesy of Ufuk Topcu and concepts courtesy of [1].

Control-oriented Learning on the Fly

Consider an extreme yet realistic scenario wherein an aerial vehicle suffers an abrupt change in its dynamics—e.g., due to severe structural damage or loss of an engine—and attempts to retain control. This circumstance necessitates learning the dynamics from data along only a single (and ongoing) trajectory. Under such severe data limitations, learning can proceed efficiently only by incorporating existing side information about properties of the underlying unknown dynamics. It is worth emphasizing that the availability of side information does not imply that the dynamics—or even an a priori parametrization of them—are known.

Figure 2 depicts the workflow of a recently proposed method. Using existing data from the ongoing run as well as side information, this method computes an overapproximation of the set of states that the system may reach (see steps 2 and 3 in Figure 2). Data-driven differential inclusions that contain the unknown vector field offer control-oriented representations of these overapproximations. The method computes such differential inclusions using an interval Taylor-based technique that can enforce constraints from side information to reduce the width of the overapproximations as more data become available. It then incorporates an overapproximation into a constrained short-horizon optimal control problem (see steps 4 and 5 in Figure 2) and solves this problem on the fly by resorting to convex relaxations that are optimistic or pessimistic. One can provide a theoretical bound (which is also practically relevant) on the number of primitive operations that are necessary to compute the control values, thus ensuring real-time implementability.
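To convey the flavor of such interval overapproximations—this is a minimal sketch of the underlying idea, not the interval Taylor-based technique of [1]—the snippet below propagates an interval that contains the true state under hypothetical bounds on the unknown vector field, of the kind that data plus side information might supply:

```python
import numpy as np

def interval_euler_step(x_lo, x_hi, f_lo, f_hi, dt):
    """One Euler step of the differential inclusion x' in [f_lo, f_hi].

    Propagates an interval [x_lo, x_hi] that contains the true state,
    assuming the (hypothetical) bounds f_lo <= f(x, u) <= f_hi hold on
    the current interval -- e.g., bounds derived from data together
    with side information such as a known Lipschitz constant.
    """
    return x_lo + dt * f_lo, x_hi + dt * f_hi

# A scalar example: the unknown vector field is only known to satisfy
# |f| <= 1 (side information), and the state starts somewhere in [0, 0.1].
x_lo, x_hi = np.array([0.0]), np.array([0.1])
f_lo, f_hi = np.array([-1.0]), np.array([1.0])
for _ in range(10):
    x_lo, x_hi = interval_euler_step(x_lo, x_hi, f_lo, f_hi, dt=0.01)
```

The width of the interval grows with the gap between the bounds on the vector field, which is why tightening those bounds with side information as data accumulate directly tightens the reachable-set overapproximation.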

Figure 2. The workflow for learning on the fly. Figure courtesy of Ufuk Topcu and concepts courtesy of [1].

Dynamical Systems Models Using Neural Networks with Physics-informed Architecture and Constraints

While the first method focuses on learning over a single trajectory (and is hence only concerned with discovering knowledge that will influence the evolution along that trajectory), our second method focuses on learning complex relationships—encoded as neural networks—that model unknown dynamical systems.

This method uses neural networks to parametrize the dynamical system’s vector field and numerical integration schemes to predict future states, rather than parametrizing a model that predicts the next state directly (see Figure 3). Two distinct mechanisms help incorporate physics-based knowledge as inductive bias. The first mechanism informs the choice of architecture for the vector field as a composition of unknown terms that are parametrized as neural networks and known terms that are derived from a priori knowledge:

\[\dot{x}=h(x,u)=F(x,u,g_1(\cdot),...,g_d(\cdot)).\]

Here, \(F\) is a known differentiable function that encodes available prior knowledge and the functions \(g_1,...,g_d\) encode the unknown terms. For example, Figure 4 illustrates a robotics environment whose equations of motion are only partially known; although the robot’s mass matrix is available, the remaining terms in its equations of motion (i.e., its actuation and contact forces) must be learned. One can then compose neural networks that represent the unknown terms with the known mass matrix to obtain a model of the system’s vector field.
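As an illustrative sketch of this first mechanism—with a hypothetical damped-oscillator system rather than the models of [2]—the snippet below composes a known linear term with a small neural network that stands in for the unknown residual dynamics, and predicts the next state by numerically integrating the learned vector field with a Runge-Kutta step:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(params, x):
    """A tiny two-layer network standing in for one of the unknown
    terms g_1, ..., g_d in the vector field."""
    W1, b1, W2, b2 = params
    return W2 @ np.tanh(W1 @ x + b1) + b2

def make_params(n_in, n_hidden, n_out, scale=0.1):
    return (scale * rng.standard_normal((n_hidden, n_in)),
            np.zeros(n_hidden),
            scale * rng.standard_normal((n_out, n_hidden)),
            np.zeros(n_out))

# Known structure: xdot = F(x, g) = A @ x + g(x), where A comes from
# a priori knowledge (here, a hypothetical damped oscillator) and the
# residual g is the term to be learned.
A = np.array([[0.0, 1.0], [-1.0, -0.1]])

def vector_field(params, x):
    return A @ x + mlp(params, x)

def rk4_step(params, x, dt):
    """Predict the next state by integrating the learned vector field,
    rather than learning a next-state map directly."""
    k1 = vector_field(params, x)
    k2 = vector_field(params, x + 0.5 * dt * k1)
    k3 = vector_field(params, x + 0.5 * dt * k2)
    k4 = vector_field(params, x + dt * k3)
    return x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)

params = make_params(2, 16, 2)
x_next = rk4_step(params, np.array([1.0, 0.0]), dt=0.01)
```

Because the known term enters the model exactly, the network only has to account for the residual dynamics, which is one source of the data efficiency described above.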

Figure 3. A structured representation of the vector field (in blue) captures a priori physics-based knowledge, while neural networks (in red) represent unknown components. Physics-based constraints are enforced on the outputs, not only on the labeled training data points but also on any unlabeled points within the state space where the constraints are known to hold (in yellow). Figure adapted from [2].

The second mechanism enforces physics-based constraints on the values of the model’s outputs and internal states. Consider a particular physics-informed model \(F(x,u,G_\Theta)\) of the vector field with unknown terms that are parametrized by the collection of neural networks \(G_\Theta\). The objective is to solve for a set of parameter values that minimize the loss \(\mathcal{J}(\Theta)\) over the training dataset while also satisfying all of the known physics-based constraints:

\[\underset{\Theta}{\min} \mathcal{J}(\Theta) \quad \textrm{s.t.} \quad \Phi(x,u,G_\Theta)=0, \enspace \forall (x,u)\in\mathcal{C}_\Phi \quad \textrm{and} \quad \Psi(x,u,G_\Theta)\le 0, \enspace \forall (x,u) \in \mathcal{C}_{\Psi}.\]

Here, \(\Phi(x,u,G_\Theta)\) and \(\Psi(x,u,G_\Theta)\) are differentiable functions that respectively capture the physics-informed equality and inequality constraints (of which there may be a multitude).

While only a limited number of data points might be available for supervised learning, constraints that are derived from a priori knowledge will hold over large subsets of the state space — and potentially over the entire state space. Ensuring that the learned model satisfies the relevant constraints throughout the space—while also fitting the available trajectory data—yields a semi-supervised learning scheme that generalizes the available training data to the unlabeled portions of the state space.
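One simple way to approximate such a constrained problem in practice is a penalty relaxation that evaluates the constraint residuals at unlabeled collocation points sampled from \(\mathcal{C}_\Phi\) and \(\mathcal{C}_\Psi\). The sketch below is illustrative—the function names are hypothetical and this is not the exact training formulation of [2]:

```python
import numpy as np

def penalized_loss(theta, data, collocation, model, phi, psi, lam=10.0):
    """Supervised loss on labeled trajectory data, plus penalty terms
    that push the model toward Phi = 0 and Psi <= 0 on unlabeled
    collocation points.

    `model`, `phi`, and `psi` are placeholders for the physics-informed
    vector field and the known constraint functions, respectively."""
    xs, us, ys = data
    fit = np.mean((model(theta, xs, us) - ys) ** 2)           # J(Theta)
    xc, uc = collocation
    eq = np.mean(phi(theta, xc, uc) ** 2)                     # Phi = 0
    ineq = np.mean(np.maximum(psi(theta, xc, uc), 0.0) ** 2)  # Psi <= 0
    return fit + lam * (eq + ineq)

# Toy example: a scalar model y = theta * x with a made-up inequality
# constraint that the output stays below 10 on the collocation points.
model = lambda th, x, u: th * x
phi = lambda th, x, u: np.zeros_like(x)   # no equality constraints here
psi = lambda th, x, u: th * x - 10.0      # enforce th * x <= 10
data = (np.array([1.0, 2.0]), np.zeros(2), np.array([2.0, 4.0]))
collocation = (np.array([0.5, 1.5]), np.zeros(2))
loss = penalized_loss(2.0, data, collocation, model, phi, psi)
```

Hard-constraint alternatives (e.g., projection layers or constrained optimizers) also exist; the appropriate choice depends on the structure of the constraints. Either way, only the penalty terms touch the unlabeled points, which is what makes the scheme semi-supervised.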

Figure 4. Representative results that utilize a suite of robotics benchmarks [3]. 4a. A suite of simulated robotic systems for demonstration. 4b – 4c. Selected results demonstrate the utility of incorporating different types of physics-based side knowledge, including (4b) improvement in test performance and (4c) compliance with the underlying laws of physics — even in parts of the environment with no training data. Figure adapted from [2].

Outlook

The potential of learning with insights from diverse sources of data and existing knowledge extends far beyond the two examples that we discuss here and has recently attracted considerable attention. Existing approaches include physics-informed neural networks, neurosymbolic reinforcement learning, and learning with statistical invariants. We argue that such hybrid approaches will be key to establishing provable guarantees of safety, robustness, and generalizability within the data, computational, and perceptual limitations of dynamical systems with a physical embodiment.


Acknowledgments: Ufuk Topcu is supported by the Air Force Office of Scientific Research’s Multidisciplinary Research Program of the University Research Initiative on “Verifiable, Control-oriented Learning on the Fly.”

References

[1] Djeumou, F., & Topcu, U. (2022). Learning to reach, swim, walk and fly in one trial: Data-driven control with scarce data and side information. In Proceedings of the Learning for Dynamics and Control Conference (pp. 453-466). Stanford, CA.
[2] Djeumou, F., Neary, C., Goubault, E., Putot, S., & Topcu, U. (2022). Neural networks with physics-informed architectures and constraints for dynamical systems modeling. In Proceedings of the Learning for Dynamics and Control Conference (pp. 263-277). Stanford, CA.
[3] Freeman, C.D., Frey, E., Raichuk, A., Girgin, S., Mordatch, I., & Bachem, O. (2021). Brax — a differentiable physics engine for large scale rigid body simulation. Preprint, arXiv:2106.13281.

About the Authors