Commit 853bc9d

Support for CBLAS
1 parent 25b21ce commit 853bc9d

1,349 files changed

Lines changed: 449320 additions & 0 deletions


CBLAS/Makefile

Lines changed: 4932 additions & 0 deletions
Large diffs are not rendered by default.

CBLAS/doc/TOLERANCES_BY_MODE.md

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
# Step size (h) and tolerances (atol, rtol) by test mode

Generated from `run_tapenade_cblas.py` test generators. Pass criterion: `|error| <= atol + rtol * |reference|`.

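The pass criterion above can be sketched in a few lines of Python. This is a minimal illustration, not code taken from `run_tapenade_cblas.py`: the function name `passes` and the sample values are hypothetical, and only the inequality itself comes from this document.

```python
def passes(error, reference, atol, rtol):
    """Return True when |error| <= atol + rtol * |reference|."""
    return abs(error) <= atol + rtol * abs(reference)


# Double-precision _d tolerances listed in this document: atol = rtol = 1.0e-5.
ATOL, RTOL = 1.0e-5, 1.0e-5

reference = 2.0  # hypothetical reference derivative value
ok = passes(2.00001 - reference, reference, ATOL, RTOL)   # small error
bad = passes(2.001 - reference, reference, ATOL, RTOL)    # large error
print(ok, bad)
```

Because the bound mixes an absolute and a relative term, the check stays meaningful both when the reference is near zero (where `atol` dominates) and when it is large (where `rtol` dominates).
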
## Consistency by category

| Mode | Consistent? | Exceptions |
|------|-------------|------------|
| **\_d** | **Yes** | All _d tests use the same h, atol, rtol (single: 1e-3, 2e-3, 2e-3; double: 1e-6, 1e-5, 1e-5). |
| **\_dv** | **Yes** | All _dv tests use the same values (single: 1e-3, 5e-3, 5e-3; double: 1e-6, 1e-5, 1e-5). |
| **\_b** | **No** | **nrm2** (dnrm2_b, snrm2_b): single atol=rtol=**2.0e-3**; all others: single atol=rtol=1.0e-2. Double and h are the same. |
| **\_bv** | **No** | **Scalar-result** (dasum_bv, sasum_bv, ddot_bv, sdot_bv, dnrm2_bv, snrm2_bv): double **h=1.0e-6** and single atol=rtol=**2.0e-3**; generic/gemm _bv: double h=1.0e-7, single atol=rtol=1.0e-2. |

## Full table

| Mode | Category | Precision | h (step size) | atol | rtol |
|------|----------|-----------|---------------|------|------|
| **\_d** | Forward scalar | Single (s, c) | 1.0e-3 | 2.0e-3 | 2.0e-3 |
| **\_d** | Forward scalar | Double (d, z) | 1.0e-6 | 1.0e-5 | 1.0e-5 |
| **\_dv** | Forward vector | Single (s, c) | 1.0e-3 | 5.0e-3 | 5.0e-3 |
| **\_dv** | Forward vector | Double (d, z) | 1.0e-6 | 1.0e-5 | 1.0e-5 |
| **\_b** | Reverse scalar (generic, gemm) | Single (s, c) | 1.0e-3 | 1.0e-2 | 1.0e-2 |
| **\_b** | Reverse scalar (generic, gemm) | Double (d, z) | 1.0e-7 | 1.0e-5 | 1.0e-5 |
| **\_b** | Reverse scalar (nrm2 only) | Single (s, c) | 1.0e-3 | **2.0e-3** | **2.0e-3** |
| **\_b** | Reverse scalar (nrm2 only) | Double (d, z) | 1.0e-7 | 1.0e-5 | 1.0e-5 |
| **\_bv** | Reverse vector (generic VJP, gemm) | Single (s, c) | 1.0e-3 | 1.0e-2 | 1.0e-2 |
| **\_bv** | Reverse vector (generic VJP, gemm) | Double (d, z) | 1.0e-7 | 1.0e-5 | 1.0e-5 |
| **\_bv** | Reverse vector (scalar-result: dasum, ddot, nrm2, etc.) | Single (s, c) | 1.0e-3 | **2.0e-3** | **2.0e-3** |
| **\_bv** | Reverse vector (scalar-result) | Double (d, z) | **1.0e-6** | 1.0e-5 | 1.0e-5 |

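For readers who want the table in machine-readable form, the following Python sketch encodes its rows as a lookup keyed by mode, category, and precision. The names `TOLERANCES`, `SCALAR_RESULT_STEMS`, `precision_of`, and `lookup` are hypothetical and do not come from `run_tapenade_cblas.py`. The sketch also assumes that the scalar-result routines are only the ones named in this document (asum, dot, nrm2 with an s or d prefix); the "etc." in the table is not expanded.

```python
# (mode, category, precision) -> (h, atol, rtol), copied from the full table.
TOLERANCES = {
    ("d", "generic", "single"): (1.0e-3, 2.0e-3, 2.0e-3),
    ("d", "generic", "double"): (1.0e-6, 1.0e-5, 1.0e-5),
    ("dv", "generic", "single"): (1.0e-3, 5.0e-3, 5.0e-3),
    ("dv", "generic", "double"): (1.0e-6, 1.0e-5, 1.0e-5),
    ("b", "generic", "single"): (1.0e-3, 1.0e-2, 1.0e-2),
    ("b", "generic", "double"): (1.0e-7, 1.0e-5, 1.0e-5),
    ("b", "nrm2", "single"): (1.0e-3, 2.0e-3, 2.0e-3),
    ("b", "nrm2", "double"): (1.0e-7, 1.0e-5, 1.0e-5),
    ("bv", "generic", "single"): (1.0e-3, 1.0e-2, 1.0e-2),
    ("bv", "generic", "double"): (1.0e-7, 1.0e-5, 1.0e-5),
    ("bv", "scalar_result", "single"): (1.0e-3, 2.0e-3, 2.0e-3),
    ("bv", "scalar_result", "double"): (1.0e-6, 1.0e-5, 1.0e-5),
}

# Assumption: only the routine stems named in this document are scalar-result.
SCALAR_RESULT_STEMS = {"asum", "dot", "nrm2"}


def precision_of(routine):
    """Map the BLAS prefix to a precision: s, c -> single; d, z -> double."""
    prefix = routine[0].lower()
    if prefix in ("s", "c"):
        return "single"
    if prefix in ("d", "z"):
        return "double"
    raise ValueError(f"unknown BLAS prefix in {routine!r}")


def lookup(test_name):
    """Return (h, atol, rtol) for a name such as 'snrm2_b' or 'dgemm_bv'."""
    routine, _, mode = test_name.rpartition("_")
    stem = routine[1:]
    category = "generic"
    if mode == "b" and stem == "nrm2":
        category = "nrm2"
    elif mode == "bv" and stem in SCALAR_RESULT_STEMS:
        category = "scalar_result"
    return TOLERANCES[(mode, category, precision_of(routine))]


print(lookup("snrm2_b"), lookup("dgemm_bv"))
```

Keeping the exceptions as explicit categories, rather than as special cases scattered through the test code, makes the two inconsistent modes (\_b and \_bv) easy to spot.
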
## Notes

- **\_d**: Matches Fortran BLAS forward tests (e.g. `test_sgemm.f90` / `test_dgemm.f90`).
- **\_dv**: Same h as _d; atol/rtol slightly looser for single precision (5.0e-3) for multi-direction FD.
- **\_b** / **\_bv** (generic): VJP check; smaller h (1.0e-7 for double) for central-difference stability; looser single-precision tolerances (1.0e-2).
- **\_bv** (nrm2-style): Used for scalar-result routines (e.g. snrm2_bv, dnrm2_bv); h and atol/rtol aligned with nrm2 _b/_dv tests.
- **nrm2 _b** (reverse scalar): Same as generic _b (h=1.0e-7 double / 1.0e-3 float; atol=rtol=1.0e-5 double / 1.0e-2 float). **nrm2 _d** uses h=1.0e-7 double, atol=rtol=1.0e-5; single uses h=1.0e-3, atol=rtol=2.0e-3.
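
The VJP check with central differences mentioned in the notes can be sketched as follows. This is an illustration under stated assumptions, not the generated test code: `nrm2_b` here is a hand-written adjoint standing in for the Tapenade-generated routine, the helper names are hypothetical, and the finite-difference value is assumed to play the role of the reference in the pass criterion. The step and tolerances are the double-precision nrm2 \_b values from the full table.

```python
import math


def nrm2(x):
    """Euclidean norm of a vector (pure-Python stand-in for dnrm2)."""
    return math.sqrt(sum(xi * xi for xi in x))


def nrm2_b(x, yb):
    """Hand-written adjoint of nrm2: xb = yb * x / ||x||."""
    norm = nrm2(x)
    return [yb * xi / norm for xi in x]


def central_difference(f, x, v, h):
    """Directional derivative of f at x along v: (f(x+hv) - f(x-hv)) / (2h)."""
    x_plus = [xi + h * vi for xi, vi in zip(x, v)]
    x_minus = [xi - h * vi for xi, vi in zip(x, v)]
    return (f(x_plus) - f(x_minus)) / (2.0 * h)


def vjp_check(x, v, h, atol, rtol):
    """Compare the adjoint contracted with v against the central difference."""
    reference = central_difference(nrm2, x, v, h)
    adjoint = sum(xb_i * v_i for xb_i, v_i in zip(nrm2_b(x, 1.0), v))
    error = adjoint - reference
    return abs(error) <= atol + rtol * abs(reference)


x = [3.0, 4.0, 12.0]   # hypothetical input vector
v = [1.0, -2.0, 0.5]   # hypothetical perturbation direction
result = vjp_check(x, v, h=1.0e-7, atol=1.0e-5, rtol=1.0e-5)
print(result)
```

The central difference has truncation error of order h squared, so the remaining error at small h is dominated by floating-point rounding, which is what the tolerances have to absorb.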

CBLAS/include/DIFFSIZES.f90

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
MODULE DIFFSIZES
  Implicit None
  integer, parameter :: nbdirsmax=4
  integer, parameter :: ISIZE1OFsx=4
  integer, parameter :: ISIZE1OFcx=4
  integer, parameter :: ISIZE1OFap=4
  integer, parameter :: ISIZE1OFzy=4
  integer, parameter :: ISIZE1OFsy=4
  integer, parameter :: ISIZE1OFdy=4
  integer, parameter :: ISIZE1OFzx=4
  integer, parameter :: ISIZE1OFcy=4
  integer, parameter :: ISIZE1OFdx=4
  integer, parameter :: ISIZE2OFa=4
  integer, parameter :: ISIZE1OFx=4
  integer, parameter :: ISIZE2OFb=4
  integer, parameter :: ISIZE1OFy=4
END MODULE DIFFSIZES

CBLAS/include/DIFFSIZESC.inc

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
#ifndef DIFFSIZESC_INCLUDED
#define DIFFSIZESC_INCLUDED
#ifndef NBDirsMax
#define NBDirsMax 4
#endif
#endif

CBLAS/include/DIFFSIZESF.inc

Lines changed: 28 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,28 @@
      integer nbdirsmax
      parameter (nbdirsmax=4)
      integer ISIZE1OFsx
      parameter (ISIZE1OFsx=4)
      integer ISIZE1OFcx
      parameter (ISIZE1OFcx=4)
      integer ISIZE1OFap
      parameter (ISIZE1OFap=4)
      integer ISIZE1OFzy
      parameter (ISIZE1OFzy=4)
      integer ISIZE1OFsy
      parameter (ISIZE1OFsy=4)
      integer ISIZE1OFdy
      parameter (ISIZE1OFdy=4)
      integer ISIZE1OFzx
      parameter (ISIZE1OFzx=4)
      integer ISIZE1OFcy
      parameter (ISIZE1OFcy=4)
      integer ISIZE1OFdx
      parameter (ISIZE1OFdx=4)
      integer ISIZE2OFa
      parameter (ISIZE2OFa=4)
      integer ISIZE1OFx
      parameter (ISIZE1OFx=4)
      integer ISIZE2OFb
      parameter (ISIZE2OFb=4)
      integer ISIZE1OFy
      parameter (ISIZE1OFy=4)

CBLAS/include/cblas_b.h

Lines changed: 486 additions & 0 deletions
Large diffs are not rendered by default.

CBLAS/include/cblas_bv.h

Lines changed: 927 additions & 0 deletions
Large diffs are not rendered by default.

CBLAS/include/cblas_d.h

Lines changed: 492 additions & 0 deletions
Large diffs are not rendered by default.

CBLAS/include/cblas_dv.h

Lines changed: 492 additions & 0 deletions
Large diffs are not rendered by default.
