
Commit 935b8c5

sriharikrishna authored and amontoison committed
Update README.md
Add to the readme
1 parent e2026b0 commit 935b8c5

1 file changed

Lines changed: 136 additions & 44 deletions

README.md

@@ -1,69 +1,161 @@
1-
# diff-lapack
1+
# `diffblas`
22

3-
## Installation
3+
`diffblas` is a library that provides algorithmically differentiated BLAS routines, generated from their reference implementation in [LAPACK](https://github.com/Reference-LAPACK/lapack) using the automatic differentiation tool [Tapenade](https://gitlab.inria.fr/tapenade/tapenade) in four modes: forward (`_d`), vector forward (`_dv`), reverse (`_b`), and vector reverse (`_bv`).
4+
The compiled `libdiffblas` can be linked into applications that need derivatives of BLAS operations, for example for optimization or sensitivity analysis.
45

5-
We support building `diffblas` with the [Meson Build system](https://mesonbuild.com):
6+
This work was inspired in part by the need to differentiate the Fortran code [HFBTHO](https://www.sciencedirect.com/science/article/abs/pii/S0010465525004564), which uses LAPACK and BLAS routines, and to use the differentiated application for gradient-based optimization.
7+
8+
## Using the pre-compiled library from your application
9+
10+
Use the library the same way the tests do:
11+
1. Call the differentiated routines from your application.
12+
2. Include the right header/module.
13+
3. Link your application with `libdiffblas` and the BLAS library (`refblas`).
14+
15+
16+
**Calling convention (same as in the tests):**
17+
18+
1. **Forward (`_d`):** Declare the differentiated routine as `external` (or use a module). Call it with the same arguments as the original BLAS routine, plus one extra “derivative” argument per floating-point input/output (e.g. `a`, `a_d`). Example for DGEMM:
19+
```fortran
20+
external :: dgemm_d
21+
! ... set transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc
22+
! ... set alpha_d, a_d, b_d, beta_d, c_d (derivative inputs; c_d is output)
23+
call dgemm_d(transa, transb, m, n, k, alpha, alpha_d, a, a_d, lda, b, b_d, ldb, beta, beta_d, c, c_d, ldc)
24+
```
25+
2. **Reverse (`_b`):** Call the reverse routine with the same shape as the forward call; the “b” arguments carry adjoints (see generated test files such as `BLAS/test/test_dgemm_reverse.f90`).
26+
3. **Vector forward (`_dv`) / vector reverse (`_bv`):** Same idea with an extra dimension for multiple directions; see `BLAS/test/test_*_vector_forward.f90` and `test_*_vector_reverse.f90`.
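As a worked equation for the calling convention above, consider the no-transpose case (`transa = transb = 'N'`), writing $\dot{A}$ for a forward derivative argument such as `a_d` and $\bar{A}$ for an adjoint argument such as `ab`. This is the standard mathematical derivative of GEMM, given here as an illustration under those assumptions; it is not copied from the generated code, and how the generated routines accumulate or overwrite the adjoint arguments should be checked against the tests in `BLAS/test/`.

```math
\begin{aligned}
C_{\text{out}} &= \alpha A B + \beta C \\
\dot{C}_{\text{out}} &= \dot{\alpha}\, A B + \alpha \left( \dot{A} B + A \dot{B} \right) + \dot{\beta}\, C + \beta\, \dot{C} \\
\bar{A} &\mathrel{+}= \alpha\, \bar{C} B^{\top}, \qquad \bar{B} \mathrel{+}= \alpha\, A^{\top} \bar{C} \\
\bar{\alpha} &\mathrel{+}= \sum_{i,j} \bar{C}_{ij}\, (A B)_{ij}, \qquad \bar{\beta} \mathrel{+}= \sum_{i,j} \bar{C}_{ij}\, C_{ij} \\
\bar{C} &\leftarrow \beta\, \bar{C}
\end{aligned}
```

The second line is what a forward (`_d`) call propagates. The last three lines are the reverse (`_b`) updates, where $\bar{C}$ on the right-hand side is the incoming adjoint of the output and must be used before it is overwritten in the final line.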
27+
28+
***Handling dimensions:***
29+
BLAS uses assumed-size arrays, as seen in `dgemm.f`:
30+
```fortran
31+
DOUBLE PRECISION A(LDA,*),B(LDB,*),C(LDC,*)
32+
```
33+
34+
To make the differentiated versions of these routines usable in a portable way, we require you to call additional routines before calling the differentiated routine. For **reverse** (`_b`) and **vector reverse** (`_bv`), and in some cases **vector forward** (`_dv`), the differentiated code expects certain "ISIZE" values to be set before the call. These describe the dimensions of the assumed-size arrays and matrices.
35+
36+
For example (as in `BLAS/test/test_dgemm_reverse.f90`): for `dgemm_b`, set the second-dimension size of each of the two matrix arguments before the call:
37+
38+
```fortran
39+
use DIFFSIZES
40+
! ...
41+
call set_ISIZE2OFA(lda) ! leading dimension of A
42+
call set_ISIZE2OFB(ldb) ! leading dimension of B
43+
call dgemm_b(transa, transb, m, n, k, alpha, alphab, a, ab, lda, b, bb, ldb, beta, betab, c, cb, ldc)
44+
call set_ISIZE2OFA(-1)
45+
call set_ISIZE2OFB(-1)
46+
```
47+
48+
Matrix routines often use `set_ISIZE2OFA` / `set_ISIZE2OFB`; level-1 routines with vectors may use `set_ISIZE1OFX`, `set_ISIZE1OFY`, or `set_ISIZE1OFZx` / `set_ISIZE1OFZy`. The tests in `BLAS/test/` show the exact `set_ISIZE*` calls for each routine. If a required setter is not called, the differentiated routine will stop with an error. You can optionally reset a dimension by calling `set_ISIZE*(-1)` so that the next use must set it again. This is recommended if you reuse the same differentiated routine in multiple contexts.
49+
50+
***Handling directions in vector mode:***
51+
The number of directions is currently hardcoded to 4. If you need a different number, download the source of this repository, edit both `BLAS/include/DIFFSIZES.inc` and `BLAS/include/DIFFSIZES.F90`, and then rebuild the library.
52+
```fortran
53+
INTEGER, PARAMETER :: nbdirsmax = 4
54+
```
55+
```fortran
56+
integer nbdirsmax
57+
parameter (nbdirsmax=4)
58+
```
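The edit to `nbdirsmax` can be scripted so that both files stay in sync. The following is a minimal sketch, assuming the declarations look like the two snippets quoted above and that it is run from the repository root; `set_nbdirsmax` and the `NDIRS` variable are hypothetical names introduced here, not part of the repository.

```shell
#!/bin/sh
# Sketch: set nbdirsmax to a new value in one DIFFSIZES file.
# set_nbdirsmax is a hypothetical helper; it assumes the declaration
# contains "nbdirsmax = 4" or "nbdirsmax=4" as in the snippets above.
set_nbdirsmax() {
    file=$1
    ndirs=$2
    tmp="$file.tmp"
    # Replace only the integer that follows "nbdirsmax =", keeping the rest of the line.
    sed "s/\(nbdirsmax[[:space:]]*=[[:space:]]*\)[0-9][0-9]*/\1$ndirs/" "$file" > "$tmp" &&
        mv "$tmp" "$file"
}

NDIRS=${NDIRS:-8}   # hypothetical new number of directions
for f in BLAS/include/DIFFSIZES.inc BLAS/include/DIFFSIZES.F90; do
    if [ -f "$f" ]; then
        set_nbdirsmax "$f" "$NDIRS"
        grep -n "nbdirsmax" "$f"   # show the edited declaration
    else
        echo "skipped: $f not found (run from the repository root)"
    fi
done
```

After the edit, the library still has to be rebuilt for the new value to take effect.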
59+
60+
61+
**Minimal example (forward mode, one routine):** Compile your main program and link with the built library and BLAS:
662
763
```bash
8-
meson setup builddir
9-
meson compile -C builddir
10-
meson install -C builddir
64+
gfortran -O2 -I/path/to/diff-lapack/BLAS/include -c my_main.f90 -o my_main.o
65+
gfortran -o my_main my_main.o /path/to/diff-lapack/builddir/libdiffblas.a -L$LAPACKDIR -lrefblas
1166
```
1267

13-
By default, Meson compiles a static libary `libdiffblas.a`.
14-
For a shared library, please use the option `-Ddefault_library=shared`.
15-
To install in a specific location, use the option `--prefix=install/dir`
68+
(If you built with Make, use `BLAS/build/libdiffblas_d.a` for forward mode only.) The tests in `BLAS/test/` (e.g. `test_dgemm.f90`, `test_dgemm_reverse.f90`) are full examples of how to declare and call the differentiated routines and how to link; use them as templates for your application.
69+
70+
## Building the library (and tests)
71+
72+
You need **pre-generated** sources (see "Creating the library sources with run_tapenade_blas.py" below). The build compiles them and links with a BLAS library and the Tapenade adStack runtime.
73+
74+
### Build with Meson (library and tests)
75+
76+
**Dependencies:**
77+
78+
- **Fortran compiler** (e.g. gfortran, ifort, ifx) and **C compiler** (e.g. gcc).
79+
- **LAPACK installation** — a built Reference LAPACK (or compatible) providing BLAS (e.g. `librefblas.a` or `libblas.a`). Set **`LAPACKDIR`** (or equivalent) so Meson can find it (see below).
80+
- **Tapenade adStack** — the repo already contains `TAPENADE/adStack.c` and `TAPENADE/include/`; Meson compiles and links these automatically. No separate Tapenade install is required for the build.
81+
82+
**Configure and build from the project root:**
1683

1784
```bash
18-
meson setup builddir -Ddefault_library=shared --prefix=$(pwd)/diffblas
85+
# If BLAS is in a custom location (e.g. LAPACK build dir), pass library search path
86+
meson setup builddir -Dlibblas=refblas -Dlibblas_path=/path/to/lapack/build
87+
88+
# Or rely on system / environment (e.g. LIBRARY_PATH, or refblas in default path)
89+
meson setup builddir -Dlibblas=refblas
90+
1991
meson compile -C builddir
20-
meson install -C builddir
2192
```
2293

23-
## Launching the differentiation
94+
This produces `builddir/libdiffblas.a` (or a shared library if configured with `-Ddefault_library=shared`). The Meson build uses `BLAS/meson.build` and compiles everything in `BLAS/src/` plus `BLAS/include/DIFFSIZES.f90`, `BLAS/src/DIFFSIZES_access.f`, and `TAPENADE/adStack.c`. It does **not** build the test executables; those are built by the BLAS Makefile (see "Build with Make" below).
2495

25-
Get the latest version from the LAPACK webpage
96+
To also run the tests, build them with the BLAS Makefile (see "Build with Make" below) and run the resulting test programs. To install the library:
2697

27-
```shell
28-
wget -q -O tmp.html https://raw.githubusercontent.com/Reference-LAPACK/lapack/refs/heads/master/README.md
29-
VERSION=`cat tmp.html | grep "VERSION" | tail -1 | awk '{print $3}'`
30-
rm tmp.html
98+
```bash
99+
meson configure builddir -Dprefix=/your/install   # the prefix is a setup/configure option, not a `meson install` flag
meson install -C builddir
31100
```
32-
And use this to create the link to the most recent LAPACK release
33101

34-
```shell
35-
wget -P tmp/ https://github.com/Reference-LAPACK/lapack/archive/refs/tags/v$VERSION.tar.gz
36-
tar xvzf tmp/v$VERSION.tar.gz -C tmp/
37-
rm tmp/v$VERSION.tar.gz
38-
python run_tapenade_blas.py --input-dir tmp/lapack-$VERSION/BLAS/SRC/ --out-dir out
39-
```
102+
### Build with Make (library and tests)
103+
104+
**Dependencies:**
40105

41-
These commands are gathered in a bash file `get_and_run.sh` calling it directly from bash
106+
- **Fortran compiler** (e.g. gfortran) and **C compiler** (e.g. gcc).
107+
- **LAPACK installed** — build and install [Reference LAPACK](https://github.com/Reference-LAPACK/lapack) so that you have `librefblas` (or set `BLAS_LIB` in the Makefile to point at your BLAS).
108+
- **`LAPACKDIR`** — set to the directory where LAPACK was built (or where `librefblas.a` lives), so the linker can find `-lrefblas`:
109+
```bash
110+
export LAPACKDIR=/path/to/lapack/build
111+
```
112+
- **Tapenade adStack** — the Makefile looks for `adStack.c` in this order: `BLAS/src/adStack.c`, then `../TAPENADE/adStack.c`, then `$TAPENADEDIR/ADFirstAidKit/adStack.c`. The repo’s `TAPENADE/` copy is used by default; no Tapenade install needed if you do not override.
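The search order for `adStack.c` described in the last item can be sketched as a small shell function that reports which copy would be picked. This mirrors the order stated above and is not the Makefile's actual logic; `find_adstack` and the `BLAS_DIR` variable are hypothetical names, and the paths are resolved relative to the BLAS directory passed in.

```shell
#!/bin/sh
# Sketch: report which adStack.c would be found, following the order described above.
# find_adstack is a hypothetical helper, not part of the repository.
find_adstack() {
    blas_dir=$1
    for candidate in \
        "$blas_dir/src/adStack.c" \
        "$blas_dir/../TAPENADE/adStack.c" \
        "${TAPENADEDIR:-}/ADFirstAidKit/adStack.c"
    do
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
    done
    return 1
}

# BLAS_DIR is an assumed variable pointing at the BLAS directory of the checkout.
find_adstack "${BLAS_DIR:-BLAS}" ||
    echo "no adStack.c found; set TAPENADEDIR or check the repository layout"
```

Running this before `make` is a quick way to confirm that the repository copy, rather than an external Tapenade install, will be used.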
42113

43-
Some remarks:
44-
-------------
114+
**Build from the BLAS directory:**
45115

46-
* If one needs to run only a subset of the BLAS files:
47-
```shell
48-
python run_tapenade_blas.py --input-dir tmp/lapack-$VERSION/BLAS/SRC/ --out-dir out --file FILE1 FILE2
116+
```bash
117+
cd BLAS
118+
export LAPACKDIR=/path/to/your/lapack/build # or wherever librefblas is
119+
make
49120
```
50-
* The script has been tested for Python3. One could need to adapt the command line with the python version depending on your system.
51121

52-
## Calling the tests
122+
This builds per-mode static libraries (`build/libdiffblas_d.a`, `libdiffblas_b.a`, `libdiffblas_dv.a`, `libdiffblas_bv.a`) and test executables in `build/`. Run tests:
53123

54-
```shell
55-
cd out/
56-
make # Will compile everything
57-
./run_tests.sh # will run all available tests
124+
```bash
125+
./run_tests.sh
126+
# or run individual test executables, e.g. build/test_dgemm, build/test_dgemm_reverse
58127
```
59128

60-
Similarly, one can limit to certain functions only:
61-
```shell
62-
make dgemm # Will compile only dgemm
129+
## Creating the library sources with run_tapenade_blas.py
130+
131+
This step **generates** the differentiated Fortran sources and test programs (e.g. under `BLAS/src/`, `BLAS/test/`, or a custom output directory like `out/`).
132+
133+
**Dependencies:**
134+
135+
- **LAPACK source tree** — Reference LAPACK (or at least its BLAS SRC directory) so that `--input-dir` points to the BLAS source files (e.g. `lapack-3.x/BLAS/SRC/`).
136+
- **Tapenade** — installed and on your `PATH` (or pass `--tapenade-bin=/path/to/tapenade`).
137+
- **Python 3** — to run `run_tapenade_blas.py`.
138+
139+
**Example: generate from a Reference LAPACK tarball**
140+
141+
```bash
142+
# Download and unpack Reference LAPACK (or set LAPACK_SRC to your tree)
143+
wget -q -O tmp.html https://raw.githubusercontent.com/Reference-LAPACK/lapack/refs/heads/master/README.md
144+
VERSION=$(grep VERSION tmp.html | tail -1 | awk '{print $3}')
145+
rm tmp.html
146+
wget -P tmp/ https://github.com/Reference-LAPACK/lapack/archive/refs/tags/v${VERSION}.tar.gz
147+
tar xzf tmp/v${VERSION}.tar.gz -C tmp/
148+
149+
# Generate differentiated BLAS into BLAS/ (flat layout: src/, test/, include/)
150+
python run_tapenade_blas.py --input-dir tmp/lapack-${VERSION}/BLAS/SRC/ --out-dir BLAS --flat
63151
```
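The `VERSION` extraction above depends on the layout of the upstream README, so a guard before the download can catch an empty or malformed value. The following is a minimal sketch; `is_version` is a hypothetical helper introduced here, and the accepted pattern (dot-separated integers such as `3.12.0`) is an assumption about how release tags are numbered.

```shell
#!/bin/sh
# is_version: succeed only if $1 looks like a dotted version number (e.g. 3.12.0).
# Hypothetical helper, not part of the repository.
is_version() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+(\.[0-9]+)+$'
}

VERSION=${VERSION:-}   # normally set by the grep/awk pipeline above
if is_version "$VERSION"; then
    echo "Using LAPACK version $VERSION"
else
    echo "Could not determine a LAPACK version (got: '$VERSION')" >&2
fi
```

If the check fails, setting `VERSION` by hand to a known release tag is a reasonable fallback before running the `wget` and `tar` steps.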
64152

65-
This library generates a differentiated BLAS code using Tapenade in the following modes.
66-
1. Forward mode
67-
2. Vector forward mode
68-
3. Reverse mode
69-
4. Vector reverse mode
153+
**Options:** Use `--file dgemm.f sgemm.f` to restrict to specific routines; `--mode d b` to generate only certain modes; see `python run_tapenade_blas.py --help`.
154+
155+
## Future work
156+
Several BLAS routines have not yet been differentiated and verified; we plan to add them. We also plan to add differentiated versions of the CBLAS routines.
157+
We are beginning to write a differentiated version of the LAPACK library.
158+
159+
## Acknowledgement
160+
This work was supported in part by the Applied Mathematics activity within the U.S. Department of Energy, Office of Science, Office
161+
of Advanced Scientific Computing Research Applied Mathematics, and Office of Nuclear Physics SciDAC program under Contract No. DE-AC02-06CH11357. This work was supported in part by NSF CSSI grant 2104068.
