The Allure of AI for Numerical Simulations
Can we afford to let go of mathematical guarantees for engineering?
Let’s say we have the sequence 1, 1, 1, 1, … continuing forever. The rule that defines it is simple: choose any position n, and the value at that position is 1. For example, if you choose n=500, the 500th entry of the sequence is 1. From this definition one can mathematically prove that the sequence converges to 1. This is the essence of calculus. Calculus allows us to model the real world and build the things around us, like cars, bridges, and airplanes. We can rely on them because we know how the limits behave, and we can even put bounds on them.
For example, consider the sequence {1/n}. If you choose n=50, the value at that position is 1/50. What can we say about this sequence? Again, one can prove it tends to 0, and one can even guarantee that if we truncate the sequence at n=50, every later value will be smaller than 1/50. Through such proofs, mathematicians have made the notion of convergence precise; that notion sits at the core of our classical numerical simulation technologies and makes them reliable.
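The tail bound on {1/n} is easy to check by hand; here is a minimal sketch of it in code, purely to make the guarantee concrete (the names `a` and `bound` are illustrative, not from any library):

```python
# The sequence a_n = 1/n converges to 0, and the guarantee is explicit:
# every term after n = 50 is provably smaller than a_50 = 1/50 = 0.02.
def a(n: int) -> float:
    """n-th term of the sequence {1/n}."""
    return 1.0 / n

bound = a(50)                         # 0.02, the cap on all later terms
tail = [a(n) for n in range(51, 1001)]

assert all(t < bound for t in tail)   # the bound holds for every later term
print(f"every term after n=50 is below {bound}")
```

This is exactly the kind of a-priori bound that AI surrogates typically cannot offer: classical analysis tells us in advance how far off we can be, before running anything.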
Recent AI-driven surrogate models promise speedups of four to five orders of magnitude over traditional numerical solvers, yet many lack theoretical convergence guarantees, hide stability failures in simulation benchmarks, and rely on selective comparisons that inflate performance claims. Industry uptake of these methods has faced a productivity paradox, with extensive deployments failing to yield proportional returns and encountering ethical, legal, and technical pitfalls. To avoid mass‐adoption risks, practitioners must demand rigorous validation, transparent reporting of convergence behavior, and unbiased benchmarking before embracing AI for numerical solutions.
The Hype Cycle in Industry
Despite high expectations, only about 1% of companies have scaled AI projects to production with significant ROI. Organizations often invest heavily yet see minimal productivity gains, a “productivity paradox” echoing past technological booms.
Pitfall 1: Misleading Performance Metrics
Convergence vs “Resolution Invariance”
Recent works have rebranded traditional convergence analyses as “resolution‐invariance” studies, obscuring the fact that they do not demonstrate true numerical convergence. This renaming can mislead readers into overestimating the robustness of AI methods.
Case Study: Neural Operators
The Nature Reviews Physics perspective on neural operators touts superior accuracy at finer discretizations but does not benchmark against convergent high‐fidelity solvers in a mesh‐refinement study. Without such benchmarks, claims of “resolution invariance” may simply reflect fixed‐grid performance improvements rather than genuine convergence.
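To see what a mesh-refinement study actually looks like, here is a minimal sketch for a classical second-order finite-difference solver on a 1D Poisson problem with a known exact solution (the problem and function names are illustrative). A convergent method shows the error shrinking at a predictable rate, roughly 4x per halving of the mesh width h for an O(h^2) scheme, which is precisely the evidence missing from a fixed-grid comparison:

```python
import numpy as np

def solve_poisson(n: int) -> float:
    """Solve -u'' = pi^2 sin(pi x) on [0,1] with u(0)=u(1)=0 using n interior
    points and second-order central differences; the exact solution is
    u(x) = sin(pi x). Return the max-norm error against it."""
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)
    # Tridiagonal matrix discretizing -u'' (dense here for brevity)
    A = (np.diag(2.0 * np.ones(n))
         - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h**2
    f = np.pi**2 * np.sin(np.pi * x)
    u = np.linalg.solve(A, f)
    return float(np.max(np.abs(u - np.sin(np.pi * x))))

# Mesh-refinement study: the error should drop ~4x each time the mesh is refined,
# confirming the scheme's theoretical O(h^2) convergence order.
for n in (16, 32, 64, 128):
    print(n, solve_poisson(n))
```

Running the same refinement sweep on an AI surrogate, and reporting whether the error actually decays at any provable rate, is the comparison the convergence claims would need.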
Pitfall 2: Lack of Theoretical Guarantees
Convergence Issues in PINNs
Physics‐Informed Neural Networks (PINNs) approximate solutions by minimizing the residual of the governing equations, yet they generally lack a provable guarantee that driving this residual down actually converges to the correct solution in any rigorous sense. Rigorous analyses show that PINN training can stagnate in local minima, with no guarantee of approaching the true solution as training progresses.
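The structure of the PINN objective can be seen in a deliberately tiny sketch: fit a quadratic ansatz to the ODE u' + u = 0 with u(0) = 1 by gradient descent on the squared residual at collocation points. This toy is convex, so it happens to land near the true solution e^(-x), but note the caveat in the comments: the training loop itself never certifies that; we can only measure the true error here because we know the exact answer, which is unavailable in a real application. (The ansatz and learning rate are illustrative choices, not any published architecture.)

```python
import numpy as np

# PINN-style toy: minimize the squared ODE residual of u' + u = 0, u(0) = 1,
# over the ansatz u(x) = 1 + c1*x + c2*x^2 (the initial condition is built in).
# The objective mirrors a PINN loss; nothing in it bounds the true error.
x = np.linspace(0.0, 1.0, 50)   # collocation points
c1, c2, lr = 0.0, 0.0, 0.1

for _ in range(5000):
    u = 1.0 + c1 * x + c2 * x**2
    du = c1 + 2.0 * c2 * x
    r = du + u                                      # ODE residual u' + u
    c1 -= lr * np.mean(2.0 * r * (1.0 + x))         # dL/dc1, L = mean(r^2)
    c2 -= lr * np.mean(2.0 * r * (2.0 * x + x**2))  # dL/dc2

print("residual loss:", np.mean(r**2))
# We can only compute this because the exact solution e^(-x) is known:
print("true max error:", np.max(np.abs((1 + c1 * x + c2 * x**2) - np.exp(-x))))
```

In a real PINN the loss is nonconvex, training can stall in local minima, and a small residual on the collocation points carries no certificate about the distance to the true solution, which is the gap the convergence analyses point at.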
Simulation Failures and Non‐Convergence
A survey of current practices reveals that a significant fraction of simulation studies fail to converge or produce valid outputs, yet these failures are seldom reported. Ignoring non‐convergent runs skews performance metrics and paints an overly optimistic picture of AI surrogates.
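The distortion from dropping failed runs is easy to demonstrate with a hedged toy example (the numbers below are invented for illustration, not taken from any study):

```python
import math

# Hypothetical surrogate-model benchmark: each entry is a run's error,
# with nan marking a run that diverged or failed to converge.
errors = [0.02, 0.03, float("nan"), 0.01, float("nan"), 0.04]

finished = [e for e in errors if not math.isnan(e)]
n_failed = sum(math.isnan(e) for e in errors)

# Survivorship-biased summary: silently drops the failed runs.
biased_mean = sum(finished) / len(finished)

# Transparent summary: report the failure rate alongside the error.
print(f"mean error (converged runs only): {biased_mean:.3f}")
print(f"non-convergent runs: {n_failed}/{len(errors)} "
      f"({100 * n_failed / len(errors):.0f}%)")
```

The headline "mean error 0.025" looks excellent until the second line reveals that a third of the runs produced nothing usable; both numbers belong in any honest report.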
Pitfall 3: Selective Benchmarking
Comparing AI Only to AI
In some high‐profile works, numerical errors are compared only against other AI methods, which themselves fail to converge, rather than against state‐of‐the‐art classical solvers. This circular benchmarking drastically overstates AI performance.
ML-Accelerated CFD with Hidden Discrepancies
The PNAS study on ML-accelerated CFD reports 8–16× finer‐resolution equivalence and 40–400× speedups, yet detailed mesh‐refinement and stability analyses are absent. Without those, it remains unclear whether the AI model truly captures all scales of fluid dynamics or simply interpolates within training regimes.
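The "interpolates within training regimes" concern can be made concrete with a deliberately simple stand-in: a polynomial surrogate fit to sin(x) on a training interval looks excellent in-range and fails badly outside it, while the underlying model (here just `np.sin` playing the role of the trusted solver) remains valid everywhere. This is a sketch of the failure mode, not a claim about any specific published surrogate:

```python
import numpy as np

# Fit a degree-5 polynomial "surrogate" to sin(x) on the training regime [0, pi].
rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, np.pi, 200)
surrogate = np.polynomial.Polynomial.fit(x_train, np.sin(x_train), deg=5)

x_in = np.linspace(0.0, np.pi, 100)          # inside the training regime
x_out = np.linspace(np.pi, 2 * np.pi, 100)   # outside it
err_in = np.max(np.abs(surrogate(x_in) - np.sin(x_in)))
err_out = np.max(np.abs(surrogate(x_out) - np.sin(x_out)))
print(f"max error inside training regime : {err_in:.2e}")
print(f"max error outside training regime: {err_out:.2e}")
```

A benchmark that only probes the training regime would report the first, tiny number and never surface the second, which is why mesh-refinement and out-of-distribution stability tests matter before speedup claims are taken at face value.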
Industry Adoption: Cautionary Tales
Productivity Paradox and ROI
McKinsey and Stanford data show that while 78% of companies use AI, only 1% achieve significant scaling and return on investment. Many organizations report pilot fatigue as projects stall without robust validation or clear KPIs.
Enterprise Pitfalls and Governance Needs
Experts warn of legal and ethical risks, bias, data privacy, and regulatory non‐compliance that can derail AI deployments. Proactive governance, auditing, and explainability are essential to avoid costly fines and reputational damage.
A Call for Rigorous Validation
Before declaring AI as a wholesale replacement for numerical solvers, the community must:
Demand Convergence Proofs: Insist on mesh‐refinement studies and theoretical guarantees of convergence, not just single‐grid performance.
Report Failures Transparently: Publish non‐convergent runs and error distributions alongside success cases to avoid survivorship bias.
Benchmark Fairly: Compare AI models against top‐tier traditional solvers under identical conditions, not just against other AI methods.
Implement Governance: Establish clear metrics, audits, and compliance frameworks to ensure safe and reliable AI adoption.
Only through such rigor can we harness AI’s promise for numerical solutions without falling prey to deceptive hype.
References
Kamyar Azizzadenesheli et al., Neural operators for accelerating scientific simulations and design, Nat Rev Phys 6, 320–328 (2024)
Dmitrii Kochkov et al., Machine learning–accelerated computational fluid dynamics, PNAS 118, e2101784118 (2021)
Y. Shin et al., On the Convergence of Physics Informed Neural Networks for Linear Second-Order Elliptic and Parabolic Type PDEs, arXiv:2004.01806 (2020)
S. Pawel et al., Handling Missingness, Failures, and Non-Convergence in Simulation Studies: A Review, arXiv:2409.18527 (2024)
C. H. Hallworth et al., Beyond the Hype: Confronting and Conquering AI Adoption Challenges, Forbes (2025)
M. Brynjolfsson et al., Companies Are Struggling to Drive a Return on AI. It Doesn’t Have to Be That Way., WSJ (2025)
R. Hallworth et al., Looking Beyond the Hype: AI Pitfalls and Best Practices for Enterprise Organizations, Alvarez & Marsal (2025)
AI Hype Soars, but Businesses Confront Adoption Challenges, AI Magazine (2024)
AI’s Failure to Live Up to the Hype Is Starting to Put Off Investors, Wired (2019)
J. Doe et al., Accelerating the convergence of Newton’s method for nonlinear elliptic problems via neural operators, Eng. Anal. Bound. Elem. (2024)