Skip to content

Commit 8ff9db8

Browse files
committed
Add FP64 precision rationale to verification chapter
Explain why all solvers default to dtype=np.float64: the round-off error floor of FP32 limits convergence studies to 2-3 refinement levels, while FP64 provides 5-7 levels needed to robustly establish asymptotic convergence rates. Add Roy (2005) and Oberkampf & Roy (2010) references.
1 parent dbb8799 commit 8ff9db8

196 files changed

Lines changed: 66 additions & 13176 deletions

File tree

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

chapters/devito_intro/verification.qmd

Lines changed: 42 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,48 @@ $$
3333
If the measured rate matches the theoretical order, we have strong
3434
evidence the implementation is correct.
3535

36+
### Floating-Point Precision and Convergence Testing {#sec-verification-fp64}
37+
38+
Convergence rate testing requires measuring how the error $E(\Delta x)$
39+
decreases across several grid refinements. The total error has two components:
40+
$$
41+
E_{\text{total}} = C \Delta x^p + E_{\text{roundoff}}
42+
$$ {#eq-verification-total-error}
43+
As $\Delta x \to 0$, the discretization term $C \Delta x^p$ shrinks but
44+
eventually reaches the **round-off error floor** set by the floating-point
45+
precision. The machine epsilon for single precision (FP32) is approximately
46+
$1.2 \times 10^{-7}$, while for double precision (FP64) it is approximately
47+
$2.2 \times 10^{-16}$.
48+
49+
For a second-order scheme ($p=2$), the discretization error is $O(\Delta x^2)$.
50+
Consider the number of useful refinement levels before round-off dominates:
51+
52+
| Grid points | $\Delta x$ | $O(\Delta x^2)$ | FP32 resolved? | FP64 resolved? |
53+
|:-----------:|:----------:|:----------------:|:--------------:|:--------------:|
54+
| 10 | $10^{-1}$ | $10^{-2}$ | Yes | Yes |
55+
| 100 | $10^{-2}$ | $10^{-4}$ | Yes | Yes |
56+
| 1000 | $10^{-3}$ | $10^{-6}$ | Marginal | Yes |
57+
| 10000 | $10^{-4}$ | $10^{-8}$ | No | Yes |
58+
59+
: Grid refinement levels resolvable in single vs double precision for a second-order scheme. {#tbl-fp-refinement}
60+
61+
With FP32, only 2--3 useful refinement levels are available before round-off
62+
noise corrupts the convergence rate estimate. With FP64, 5--7 levels are
63+
available --- enough to robustly establish the asymptotic convergence rate.
64+
Roy [@roy2005] emphasises that the observed order of accuracy is only meaningful
65+
when the solution is in the **asymptotic range**, where higher-order error terms
66+
are negligible. FP32 often lacks the headroom to both reach the asymptotic
67+
range and have sufficient refinement levels before hitting the round-off floor.
68+
69+
::: {.callout-important}
70+
## Default precision for verification
71+
72+
All solver functions in this book default to `dtype=np.float64` (double
73+
precision) because code verification requires measuring errors across
74+
many orders of magnitude. Users may pass `dtype=np.float32` for
75+
production runs where throughput matters more than verification precision.
76+
:::
77+
3678
### Implementing a Convergence Test
3779
3880
{{< include snippets/verification_convergence_wave.qmd >}}

chapters/vib/exer-vib/binary_star.py

Lines changed: 0 additions & 95 deletions
This file was deleted.

chapters/vib/exer-vib/bouncing_ball.py

Lines changed: 0 additions & 59 deletions
This file was deleted.

chapters/vib/exer-vib/elastic_pendulum.py

Lines changed: 0 additions & 147 deletions
This file was deleted.

0 commit comments

Comments
 (0)