If the measured rate matches the theoretical order, we have strong
evidence the implementation is correct.

### Floating-Point Precision and Convergence Testing {#sec-verification-fp64}

Convergence rate testing requires measuring how the error $E(\Delta x)$
decreases across several grid refinements. The total error has two components:
$$
E_{\text{total}} = C \Delta x^p + E_{\text{roundoff}}
$$ {#eq-verification-total-error}
As $\Delta x \to 0$, the discretization term $C \Delta x^p$ shrinks but
eventually reaches the **round-off error floor** set by the floating-point
precision. The machine epsilon for single precision (FP32) is approximately
$1.2 \times 10^{-7}$, while for double precision (FP64) it is approximately
$2.2 \times 10^{-16}$.

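These machine-epsilon values can be checked directly with NumPy's `np.finfo`; a minimal sketch, separate from the book's solver code:

```python
import numpy as np

# Machine epsilon: the spacing between 1.0 and the next representable value.
eps32 = np.finfo(np.float32).eps  # single precision
eps64 = np.finfo(np.float64).eps  # double precision

print(f"FP32 epsilon: {eps32:.2e}")  # ~1.19e-07
print(f"FP64 epsilon: {eps64:.2e}")  # ~2.22e-16
```
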
For a second-order scheme ($p=2$), the discretization error is $O(\Delta x^2)$.
Consider the number of useful refinement levels before round-off dominates:

| Grid points | $\Delta x$ | $O(\Delta x^2)$ | FP32 resolved? | FP64 resolved? |
|:-----------:|:----------:|:---------------:|:--------------:|:--------------:|
| 10 | $10^{-1}$ | $10^{-2}$ | Yes | Yes |
| 100 | $10^{-2}$ | $10^{-4}$ | Yes | Yes |
| 1000 | $10^{-3}$ | $10^{-6}$ | Marginal | Yes |
| 10000 | $10^{-4}$ | $10^{-8}$ | No | Yes |

: Grid refinement levels resolvable in single vs double precision for a second-order scheme. {#tbl-fp-refinement}

With FP32, only 2--3 useful refinement levels are available before round-off
noise corrupts the convergence rate estimate. With FP64, 5--7 levels are
available --- enough to robustly establish the asymptotic convergence rate.
Roy [@roy2005] emphasises that the observed order of accuracy is only meaningful
when the solution is in the **asymptotic range**, where higher-order error terms
are negligible. FP32 often lacks the headroom to both reach the asymptotic
range and have sufficient refinement levels before hitting the round-off floor.

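How the round-off floor corrupts the observed order can be sketched with the two-term error model in @eq-verification-total-error. The constants below ($C = 1$, a floor at the FP32 epsilon) are illustrative assumptions, not measurements from the book's solver:

```python
import numpy as np

# Synthetic error model from the text: E_total = C*dx^p + E_roundoff.
def total_error(dx, p=2.0, C=1.0, floor=0.0):
    return C * dx**p + floor

dxs = np.array([1e-1, 1e-2, 1e-3, 1e-4])
observed = {}
for label, floor in [("exact", 0.0), ("FP32 floor", 1.2e-7)]:
    E = total_error(dxs, floor=floor)
    # Observed order between successive refinements:
    # p_obs = log(E_coarse / E_fine) / log(dx_coarse / dx_fine)
    observed[label] = np.log(E[:-1] / E[1:]) / np.log(dxs[:-1] / dxs[1:])
    print(label, np.round(observed[label], 2))
```

Without a floor the observed order sits at 2.0 on every refinement; with the FP32-scale floor it collapses well below 2 on the finest grids, which is exactly the corruption of the rate estimate described above.
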
::: {.callout-important}
## Default precision for verification

All solver functions in this book default to `dtype=np.float64` (double
precision) because code verification requires measuring errors across
many orders of magnitude. Users may pass `dtype=np.float32` for
production runs where throughput matters more than verification precision.
:::

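As a sketch of what such a `dtype` parameter might look like (the helper name `make_solver_arrays` is hypothetical; the book's actual solver functions are defined in the included snippets):

```python
import numpy as np

def make_solver_arrays(n_cells, dtype=np.float64):
    """Allocate grid spacing and solution array in the requested precision.

    Hypothetical helper for illustration only.
    """
    dx = dtype(1.0 / n_cells)
    u = np.zeros(n_cells, dtype=dtype)
    return dx, u

# Verification default: double precision.
dx, u = make_solver_arrays(1000)
print(u.dtype)  # float64

# Production override: single precision for throughput.
dx32, u32 = make_solver_arrays(1000, dtype=np.float32)
print(u32.dtype)  # float32
```
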
### Implementing a Convergence Test

{{< include snippets/verification_convergence_wave.qmd >}}