Abstract
Data-driven discovery of partial differential equations (PDEs) is a promising approach for uncovering the underlying laws governing complex systems. However, purely data-driven techniques face a trade-off between the size of the search space and optimization efficiency. This study introduces a knowledge-guided approach that incorporates existing PDEs documented in a mathematical handbook to facilitate the discovery process. These PDEs are encoded as sentence-like structures composed of operators and basic terms and are used to train a generative model, called EqGPT, which enables the generation of free-form PDEs. A loop of “generation–evaluation–optimization” is constructed to autonomously identify the most suitable PDE. Experimental results demonstrate that this framework can recover a variety of PDE forms with high accuracy and computational efficiency, particularly in cases involving complex temporal derivatives or intricate spatial terms, which are often beyond the reach of conventional methods. The approach also generalizes to irregular spatial domains and higher-dimensional settings. Notably, it succeeds in discovering a previously unreported PDE governing strongly nonlinear surface gravity waves propagating toward breaking, based on real-world experimental data, highlighting its applicability to practical scenarios and its potential to support scientific discovery.
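As a rough illustration of the sentence-like encoding and the evaluation step described in the abstract, the following minimal sketch scores a few candidate PDE "sentences" against synthetic data. It is not the EqGPT implementation: the generative model is replaced by a fixed, hypothetical candidate list, the token vocabulary `TERMS` and the function `evaluate` are names invented here, and the score (least-squares residual plus a term-count penalty) is an assumed stand-in for whatever evaluation the paper actually uses.

```python
import numpy as np

# Synthetic data (assumption): a two-mode solution of the heat equation u_t = u_xx.
x = np.linspace(0.0, 2.0 * np.pi, 128)
t = np.linspace(0.0, 1.0, 101)
X, T = np.meshgrid(x, t, indexing="ij")
u = np.exp(-T) * np.sin(X) + np.exp(-4.0 * T) * np.sin(2.0 * X)

# Finite-difference derivatives on the regular grid.
dx, dt = x[1] - x[0], t[1] - t[0]
u_x = np.gradient(u, dx, axis=0)
u_xx = np.gradient(u_x, dx, axis=0)
u_t = np.gradient(u, dt, axis=1)

# Hypothetical vocabulary: each token maps to a numerical field.
TERMS = {"u": u, "u_x": u_x, "u_xx": u_xx, "u*u_x": u * u_x, "u_t": u_t}


def evaluate(sentence, penalty=1e-3):
    """Fit coefficients of the right-hand-side terms; return (score, coefficients)."""
    lhs, rhs = sentence.split("=")
    names = [tok.strip() for tok in rhs.split("+")]
    # Drop two points at each boundary, where one-sided differences are less accurate.
    interior = (slice(2, -2), slice(2, -2))
    A = np.stack([TERMS[n][interior].ravel() for n in names], axis=1)
    b = TERMS[lhs.strip()][interior].ravel()
    coeffs, *_ = np.linalg.lstsq(A, b, rcond=None)
    mse = float(np.mean((A @ coeffs - b) ** 2))
    # Penalize longer sentences so that redundant terms are not rewarded.
    return mse + penalty * len(names), dict(zip(names, coeffs))


# Stand-in for the generator: a fixed list of candidate sentences.
candidates = [
    "u_t = u",
    "u_t = u_x",
    "u_t = u_xx",
    "u_t = u_xx + u*u_x",
]

results = {s: evaluate(s) for s in candidates}
best = min(results, key=lambda s: results[s][0])
best_coeffs = results[best][1]
print(best, best_coeffs)
```

In a full generation–evaluation–optimization loop, the scores would presumably be fed back to steer the generator toward better candidates; that feedback step is omitted here.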
| Original language | English |
|---|---|
| Article number | 10255 |
| Number of pages | 16 |
| Journal | Nature Communications |
| Volume | 16 |
| Issue number | 1 |
| Early online date | 21 Nov 2025 |
| DOIs | |
| Publication status | Published - 21 Nov 2025 |
Bibliographical note
Publisher Copyright: © The Author(s) 2025.
Funding
This work was supported and partially funded by the National Natural Science Foundation of China (grant nos. 52288101, 12501744, and 12572266), the China Postdoctoral Science Foundation (grant no. 2024M761535), the China National Postdoctoral Program for Innovative Talents (grant no. BX20250063), and the National Key Research and Development Program (grant no. 2024YFF1500600). The experimental results used here were funded by the UK Natural Environment Research Council (grant no. NE/T000309/1) awarded to A.H.C. This work was supported by the High Performance Computing Centers at the Eastern Institute of Technology, Ningbo, and the Ningbo Institute of Digital Twin.