# The Weak Form is Stronger Than You Think

For a broad class of differential equations, one can obtain the *weak form* by multiplying both sides of the equation by a sufficiently smooth function \(\phi\), integrating over a domain of interest \(\Omega\), and using integration by parts to obtain a new equation with fewer derivatives. It is a ubiquitous, well-studied, and widely utilized tool in modern computational and applied mathematics. Yet while the convolution of data with \(\phi\) (or \(\partial_t \phi\), \(\partial_x \phi\), etc.) can filter noise, conversion to the weak form is more powerful than just smoothing the data; the choice of a test function asserts a topology or scale through which to view the equation. Indeed, recent advances suggest that with a data-driven topology (encoded in the form of \(\phi\)), weak form versions of equation learning, parameter estimation, and coarse graining offer surprising noise robustness, accuracy, and computational efficiency.

#### Governing Equations

Researchers have been studying computational methods for scientific model discovery for decades, and recent years have seen an explosion of activity based on the sparse identification of nonlinear dynamics (SINDy) method [2]. SINDy learns the nonzero weights \(\{w_j\}^J_{j=1}\)—which correspond to discovered terms in a library of candidate functions \(\{f_j\}^J_{j=1}\)—by using an *equation error* (EE)-based sparse regression \(\|\partial_t \mathbf{U}-\Sigma^J_{j=1}w_j f_j(\mathbf{U})\|^2_2\) for data \(\mathbf{U}\).
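
As a rough illustration of EE-based sparse regression, the following sketch applies sequentially thresholded least squares (the thresholding scheme used in the original SINDy paper [2]) to clean samples of the logistic equation \(u' = u - u^2\). The test problem, the monomial library, the threshold value, and the names `stlsq`, `theta`, and `dudt` are assumptions chosen for illustration rather than details from the works cited above.

```python
import numpy as np

def stlsq(theta, dudt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: fit, zero small weights, refit."""
    w = np.linalg.lstsq(theta, dudt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(w) < threshold
        w[small] = 0.0
        keep = ~small
        if not keep.any():
            break
        # Refit using only the library columns that survived thresholding.
        w[keep] = np.linalg.lstsq(theta[:, keep], dudt, rcond=None)[0]
    return w

# Noise-free samples of the logistic solution with u(0) = 0.1 (assumed test problem).
t = np.linspace(0.0, 10.0, 1001)
u = 0.1 / (0.1 + 0.9 * np.exp(-t))

# Strong form: the time derivative is approximated by finite differences.
dudt = np.gradient(u, t, edge_order=2)

# Candidate library f_j(u) = u^j for j = 0, ..., 3.
theta = np.column_stack([u**j for j in range(4)])

w = stlsq(theta, dudt)
print(w)  # weights on 1, u, u^2, u^3
```

On clean data the regression should keep only the \(u\) and \(u^2\) terms. The finite-difference step is the part that becomes fragile once measurement noise is added.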

Although EE regression methods are computationally efficient, the use of noisy data presents a significant challenge due to a known bias in the resulting parameter estimates and the need to approximate derivatives of the data (e.g., \(\partial_t\mathbf{U}\)). Several groups have built upon SINDy and independently discovered that learning by means of the weak form of the model both bypasses the derivative approximation question and is highly robust to noise [3]. The core idea is that multiplying both sides of an equation with a compactly supported test function \(\phi \in C^\infty _\textrm{c}(\Omega)\) allows the movement of derivatives from the state variables to the test function. To illustrate this concept, consider a feature library that consists of spatial derivatives up to order \(K\) that act on polynomials up to order \(P\). In this case, the weak form EE residual is \(\|\langle\partial_t\phi,\mathbf{U}\rangle+\Sigma^K_{k=0}\Sigma^P_{p=0}(-1)^k w_{k,p}\langle\partial^k_x\phi,\mathbf{U}^p\rangle\|^2_2.\)
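
To see where the signs in this residual come from, it may help to write out the integration by parts for a single term. The sketch below assumes that \(\langle f,g\rangle\) denotes the \(L^2\) inner product over the space-time domain \(\Omega\) and that \(u\) is a smooth solution of \(\partial_t u=\sum_{k,p}w_{k,p}\,\partial^k_x(u^p)\); both are notational assumptions that the text above does not state explicitly. Because \(\phi\) and all of its derivatives vanish on the boundary of \(\Omega\), every boundary term drops out:

\[
\langle\phi,\partial_t u\rangle=\int_\Omega \phi\,\partial_t u\;dx\,dt=-\int_\Omega \partial_t\phi\;u\;dx\,dt=-\langle\partial_t\phi,u\rangle,
\]

\[
\langle\phi,\partial^k_x(u^p)\rangle=(-1)^k\langle\partial^k_x\phi,u^p\rangle .
\]

Testing the equation against \(\phi\), substituting these two identities, and collecting all terms on one side gives

\[
\langle\partial_t\phi,u\rangle+\sum^K_{k=0}\sum^P_{p=0}(-1)^k w_{k,p}\langle\partial^k_x\phi,u^p\rangle=0,
\]

whose left-hand side, evaluated on the data \(\mathbf{U}\) in place of \(u\), is the quantity inside the norm above.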

Figure 1 details the use of the weak form equation learning framework to discover the Kuramoto-Sivashinsky partial differential equation (PDE) in the presence of 50 percent additive independent and identically distributed Gaussian measurement noise, i.e., a signal-to-noise ratio of roughly six decibels. In this example, the candidate library encompasses all unique operators \(u \mapsto\partial^k_x(u^p)\) for \(0\le k, p \le 6\), a total of 43 terms that contain the true three-term model. It is important to note that making a mathematically justified choice for the test function \(\phi\) (*and hence the topology*) is critical to performance. Here, we match the spectral properties of the test functions to those of the data [3] to filter high-frequency noise and preserve the solution signal. By centering shifted copies of test functions on each sample point, we can create a system of equations (i.e., a regression problem for the coefficients \(\mathbf{w}\)) that yields a method for accurate PDE discovery from highly noisy data in less than a second on a modern laptop. This ability is in direct contrast to strong form methods; for example, data with more than one percent noise will prevent (strong form) SINDy from learning the Navier-Stokes equation.
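
The construction of a regression problem from shifted test functions can be sketched on a much simpler problem than Kuramoto-Sivashinsky. The example below, which is an illustration rather than the WSINDy implementation, estimates the two coefficients of the logistic equation \(u'=u-u^2\) from samples corrupted by Gaussian noise with standard deviation 0.05. The bump function, its hand-picked support radius (the spectral matching described above is not attempted), the two-term library, the choice to center test functions on every tenth sample, and all variable names are assumptions.

```python
import numpy as np

def bump_window(m, dt):
    """Sample phi(s) = exp(-1 / (1 - s^2)) and d(phi)/dt on 2m + 1 grid points.

    The support has radius m * dt; both arrays vanish at the window edges.
    """
    s = np.arange(-m, m + 1) / m
    inside = np.abs(s) < 1.0
    phi = np.zeros_like(s)
    dphi = np.zeros_like(s)
    one_minus = 1.0 - s[inside] ** 2
    phi[inside] = np.exp(-1.0 / one_minus)
    # Chain rule: d/dt = (1 / (m * dt)) d/ds.
    dphi[inside] = phi[inside] * (-2.0 * s[inside] / one_minus**2) / (m * dt)
    return phi, dphi

rng = np.random.default_rng(0)
dt = 0.01
t = np.arange(1001) * dt
u_clean = 0.1 / (0.1 + 0.9 * np.exp(-t))            # logistic solution, u(0) = 0.1
U = u_clean + 0.05 * rng.standard_normal(t.size)    # noisy measurements

m = 100                                  # support radius in grid points (assumed)
phi, dphi = bump_window(m, dt)
centers = range(m, t.size - m, 10)       # shifted copies of the same test function

# Row k of G holds <phi_k, U> and <phi_k, U^2>; entry k of b holds -<d(phi_k)/dt, U>,
# which equals <phi_k, du/dt> after integration by parts.
G = np.array([[dt * phi @ U[c - m:c + m + 1] ** p for p in (1, 2)] for c in centers])
b = np.array([-dt * dphi @ U[c - m:c + m + 1] for c in centers])

w = np.linalg.lstsq(G, b, rcond=None)[0]
print(w)  # estimates of the coefficients of u and u^2
```

No derivative of the data is ever computed here; the only derivative in the script is `dphi`, which is known in closed form.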

The discovery capabilities of the weak form are broader than simply finding a canonical PDE or ordinary differential equation to describe the data. For example, asymmetric force potentials that model attraction/repulsion, alignment, and drag can be learned for *each particle* in a deterministic interacting particle system (IPS) model of collective motion. In Figure 2a, the gray unlabeled trajectories illustrate the motion of a heterogeneous population wherein a common force model governs subsets of particles. In less than 10 seconds, a weak form method—in this case, weak SINDy (WSINDy)—can rapidly and parallelizably learn particle-specific potentials that lead to accurate trajectory predictions. This method can even cluster models to discover population structures (e.g., the teal curves in Figure 2a are from a single subpopulation), thus serving as a novel tool with which biologists can study cell population heterogeneity based on movement trajectories [8].

The weak form can also augment existing techniques, such as the creation of reduced order models (ROMs) from noise-corrupted or stochastic data. For example, we can extend the latent space dynamics identification (LaSDI) method via the weak form (WLaSDI) to robustly learn ROM dynamics [12]. Figure 2b illustrates the results of WLaSDI’s application to noisy measurements of a reaction-diffusion system’s solution, which yields a ROM with 200 times speedup and roughly four percent solution error (with the same data, a LaSDI ROM has more than 100 percent solution error). Even when we increase the noise level to 100 percent, WLaSDI still returns a ROM with less than 10 percent relative error; in contrast, a LaSDI ROM has more than 200 percent error [12].

#### Parameter Estimation

Researchers have utilized regression with EEs for parameter estimation since at least the 1950s; in fact, Marvin Shinbrot proposed a weak form of the system equations in 1954 [10]. The successor of this approach is the *modulating function* method. However, several factors have prevented widespread adoption of this class of weak form methods: (i) the challenge of selecting the test function \(\phi\), (ii) a known statistical bias in EE-based inference, and (iii) the ready availability of software that uses output error methods to match a model solution to data. We recently proposed the weak form estimation of nonlinear dynamics (WENDy) parameter inference method, which includes an automated strategy for the creation of orthogonal \(\phi\)s from multiresolution \(C^\infty_c\) functions that are merged with a generalized least squares approach to address statistical issues [1]. The combination of these two techniques generates substantial improvement in both computation time and inference accuracy. Figure 2c portrays the relative errors versus walltime in the use of WENDy to estimate parameters for the Kuramoto-Sivashinsky PDE from data with 20 percent noise. In most cases, WENDy is at least an order of magnitude more accurate and more than an order of magnitude faster than conventional output error methods [1].
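
The generalized least squares ingredient can be illustrated in isolation. The sketch below solves a generic GLS problem, i.e., it minimizes \((G\mathbf{w}-\mathbf{b})^T C^{-1}(G\mathbf{w}-\mathbf{b})\), by Cholesky whitening. The random matrices `G` and `C` are stand-ins, and the covariance model that WENDy actually constructs [1] is not reproduced here.

```python
import numpy as np

def gls_solve(G, b, C):
    """Generalized least squares via whitening with the Cholesky factor of C."""
    L = np.linalg.cholesky(C)        # C = L L^T
    G_white = np.linalg.solve(L, G)  # L^{-1} G
    b_white = np.linalg.solve(L, b)  # L^{-1} b
    # Ordinary least squares on the whitened system is GLS on the original one.
    return np.linalg.lstsq(G_white, b_white, rcond=None)[0]

rng = np.random.default_rng(1)
G = rng.standard_normal((50, 2))                 # stand-in regression matrix
w_true = np.array([1.0, -1.0])

A = rng.standard_normal((50, 50))
C = A @ A.T / 50 + 0.1 * np.eye(50)              # stand-in correlated noise covariance

# Residual noise with covariance proportional to C.
noise = 0.01 * np.linalg.cholesky(C) @ rng.standard_normal(50)
b = G @ w_true + noise

w_gls = gls_solve(G, b, C)
print(w_gls)
```

Whitening avoids forming \(C^{-1}\) explicitly, which tends to be better behaved numerically when the residual entries are strongly correlated.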

#### Coarse Graining

*Coarse graining* is the process of mapping a first principles model to a lower-order one; the technique is characterized by effective descriptions of small-scale dynamics via larger-scale quantities of interest. In many cases, we derive a solution to the coarse-grained model as a limit of solutions to the first principles model (converging in a suitable *weak topology*). This process naturally leads to questions about the role of weak form equation learning in coarse-graining applications. For a first-order stochastic IPS, WSINDy can discover the governing PDE that corresponds to its mean field McKean-Vlasov process based on histograms of discrete-time IPS samples at the \(N\)-particle level [4]. And in the context of diffusive transport with a highly oscillatory spatially-varying diffusivity, WSINDy similarly identifies the correct homogenized equation [4]. Figure 3a depicts histograms (in gray) from an \(N\)-particle system that diffuses with a large but finite spatial frequency \(\omega\), from which WSINDy can identify the correct \(N\rightarrow\infty\), \(\omega\rightarrow \infty\) homogenized system (the learned system is in teal).

For nearly-periodic Hamiltonian systems, WSINDy robustly identifies the correct leading-order Hamiltonian dynamics of reduced dimension that result from averaging around an associated periodic flow that commutes with the full dynamics to leading order [6]. In Figure 3b, noisy observations from an eight-dimensional coupled charged particle system (in white) enable the identification of a four-dimensional coarse-grained Hamiltonian system (in blue), complete with accurate identification of the ambient electric field \(\hat{V}_{\mathbf{E}}\) (background contours).

#### Future Opportunities

This article seeks to highlight the successes and opportunities of weak form methods; notable recent works suggest that many more advances are yet to be made. Computationally, the narrow-fit and trimming approach in WeakIdent can improve sparse regression [11]. On the theoretical side, we see both a novel proof of convergence (in a reproducing kernel Hilbert space) of WSINDy-created surrogate models [9] and an asymptotic result that finds the model classes for which the correct model will be recovered with probability \(1\) [5].

Finally, we note that all of the advances in this article are based on conventional techniques of statistics, applied analysis, and numerical analysis. Applications of these techniques yield versions of equation learning, parameter inference, and model coarse graining that offer substantial robustness and accuracy and demonstrate the weak form’s broad utility beyond well-known theoretical and computational methods.

*An expanded version of this article with a more complete set of references is available online [7], and the code with which to reproduce the results is accessible on our group’s webpage.*

**Acknowledgments:** This work is supported in part by National Science Foundation grants 2054085 and 2109774, National Institute of General Medical Sciences grant R35GM149335, National Institute of Food and Agriculture grant 2019-67014-29919, and Department of Energy grant DE-SC0023346.

#### References

[1] Bortz, D.M., Messenger, D.A., & Dukic, V. (2023). Direct estimation of parameters in ODE models using WENDy: Weak-form estimation of nonlinear dynamics. *Bull. Math. Biol., 85*, 110.

[2] Brunton, S.L., Proctor, J.L., & Kutz, J.N. (2016). Discovering governing equations from data by sparse identification of nonlinear dynamical systems. *Proc. Natl. Acad. Sci., 113*(15), 3932-3937.

[3] Messenger, D.A., & Bortz, D.M. (2021). Weak SINDy for partial differential equations. *J. Comput. Phys., 443*, 110525.

[4] Messenger, D.A., & Bortz, D.M. (2022). Learning mean-field equations from particle data using WSINDy. *Physica D, 439*, 133406.

[5] Messenger, D.A., & Bortz, D.M. (2024). Asymptotic consistency of the WSINDy algorithm in the limit of continuum data. *IMA J. Numer. Anal.* To be published.

[6] Messenger, D.A., Burby, J.W., & Bortz, D.M. (2024). Coarse-graining Hamiltonian systems using WSINDy. *Sci. Rep., 14*(1), 14457.

[7] Messenger, D.A., Tran, A., Dukic, V., & Bortz, D.M. (2024). The weak form is stronger than you think. Preprint, *arXiv:2409.06751*.

[8] Messenger, D.A., Wheeler, G.E., Liu, X., & Bortz, D.M. (2022). Learning anisotropic interaction rules from individual trajectories in a heterogeneous cellular population. *J. R. Soc. Interface, 19*(195), 20220412.

[9] Russo, B.P., & Laiu, M.P. (2024). Convergence of weak-SINDy surrogate models. *SIAM J. Appl. Dyn. Syst., 23*(2), 1017-1051.

[10] Shinbrot, M. (1954). *On the analysis of linear and nonlinear dynamical systems for transient-response data* (National Advisory Committee for Aeronautics technical note 3288). Moffett Field, CA: Ames Aeronautical Laboratory.

[11] Tang, M., Liao, W., Kuske, R., & Kang, S.H. (2023). WeakIdent: Weak formulation for identifying differential equation using narrow-fit and trimming. *J. Comput. Phys., 483*, 112069.

[12] Tran, A., He, X., Messenger, D.A., Choi, Y., & Bortz, D.M. (2024). Weak-form latent space dynamics identification. *Comput. Methods Appl. Mech. Eng., 427*(15), 116998.

### About the Authors

#### Daniel Messenger

##### Director’s Postdoctoral Fellow, Los Alamos National Laboratory

Daniel Messenger is a Director’s Postdoctoral Fellow at Los Alamos National Laboratory.

#### April Tran

##### Rudy Horne Graduate Fellow, University of Colorado, Boulder

April Tran is a Rudy Horne Graduate Fellow in the Department of Applied Mathematics at the University of Colorado, Boulder.

#### Vanja Dukic

##### Professor, University of Colorado, Boulder

Vanja Dukic is a professor in the Department of Applied Mathematics at the University of Colorado, Boulder.

#### David Bortz

##### Professor, University of Colorado, Boulder

David Bortz is a professor in the Department of Applied Mathematics at the University of Colorado, Boulder.