|
1 | | -# diff-lapack |
| 1 | +# `diffblas` |
2 | 2 |
|
3 | | -## Installation |
| 3 | +`diffblas` is a library of algorithmically differentiated BLAS routines, generated from the reference implementation in [lapack](https://github.com/Reference-LAPACK/lapack) with the automatic differentiation tool [Tapenade](https://gitlab.inria.fr/tapenade/tapenade) in four modes: forward (`_d`), vector forward (`_dv`), reverse (`_b`), and vector reverse (`_bv`). |
| 4 | +The compiled `libdiffblas` can be linked into applications that need derivatives of BLAS operations for optimization, sensitivity analysis, and similar tasks. |
4 | 5 |
|
5 | | -We support building `diffblas` with the [Meson Build system](https://mesonbuild.com): |
| 6 | +This work was inspired in part by the need to differentiate the Fortran code [HFBTHO](https://www.sciencedirect.com/science/article/abs/pii/S0010465525004564), which uses LAPACK and BLAS routines, and to use the differentiated application for gradient-based optimization. |
| 7 | + |
| 8 | +## Using the pre-compiled library from your application |
| 9 | + |
| 10 | +Use the library the same way the tests do: |
| 11 | +1. Call the differentiated routines from your application. |
| 12 | +2. Include the right header/module. |
| 13 | +3. Link your application with `libdiffblas` and the BLAS library (`refblas`). |
| 14 | + |
| 15 | + |
| 16 | +**Calling convention (same as in the tests):** |
| 17 | + |
| 18 | +1. **Forward (`_d`):** Declare the differentiated routine as `external` (or use a module). Call it with the same arguments as the original BLAS routine, plus one extra “derivative” argument per floating-point input/output (e.g. `a`, `a_d`). Example for DGEMM: |
| 19 | + ```fortran |
| 20 | + external :: dgemm_d |
| 21 | + ! ... set transa, transb, m, n, k, alpha, a, lda, b, ldb, beta, c, ldc |
| 22 | + ! ... set alpha_d, a_d, b_d, beta_d, c_d (derivative inputs; c_d is output) |
| 23 | + call dgemm_d(transa, transb, m, n, k, alpha, alpha_d, a, a_d, lda, b, b_d, ldb, beta, beta_d, c, c_d, ldc) |
| 24 | + ``` |
| 25 | +2. **Reverse (`_b`):** Call the reverse routine with the same shape as the forward call; the “b” arguments carry adjoints (see generated test files such as `BLAS/test/test_dgemm_reverse.f90`). |
| 26 | +3. **Vector forward (`_dv`) / vector reverse (`_bv`):** Same idea with an extra dimension for multiple directions; see `BLAS/test/test_*_vector_forward.f90` and `test_*_vector_reverse.f90`. |
| 27 | + |
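As a rough guide to what the derivative arguments mean, the tangent that `dgemm_d` is expected to return in `c_d` follows from applying the product rule to the DGEMM update. This is a sketch derived from the BLAS definition of DGEMM, not taken from the generated code. Dotted symbols stand for the `_d` arguments, and $C$ on the right-hand side is the value of `c` on entry:

```math
\begin{aligned}
C &\leftarrow \alpha\,\mathrm{op}(A)\,\mathrm{op}(B) + \beta\,C \\
\dot{C} &\leftarrow \dot{\alpha}\,\mathrm{op}(A)\,\mathrm{op}(B)
  + \alpha\left(\mathrm{op}(\dot{A})\,\mathrm{op}(B) + \mathrm{op}(A)\,\mathrm{op}(\dot{B})\right)
  + \dot{\beta}\,C + \beta\,\dot{C}
\end{aligned}
```

For instance, seeding `a_d` with a direction and setting `alpha_d`, `b_d`, `beta_d`, and `c_d` to zero would yield the directional derivative of the result with respect to `a` alone.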
| 28 | +***Handling dimensions:*** |
| 29 | +BLAS uses assumed-size arrays, as seen in `dgemm.f`: |
| 30 | +```fortran |
| 31 | + DOUBLE PRECISION A(LDA,*),B(LDB,*),C(LDC,*) |
| 32 | +``` |
| 33 | + |
| 34 | +To use the differentiated versions of these routines in a portable way, you must call additional routines before calling the differentiated routine. For **reverse** (`_b`) and **vector reverse** (`_bv`) modes, and in some cases **vector forward** (`_dv`), the differentiated code expects certain "ISIZE" values to be set before the call. These describe the dimensions of the assumed-size arrays and matrices. |
| 35 | + |
| 36 | +For example, for `dgemm_b` (as in `BLAS/test/test_dgemm_reverse.f90`), set the size of the second dimension of the two matrix arguments `a` and `b`: |
| 37 | + |
| 38 | +```fortran |
| 39 | +use DIFFSIZES |
| 40 | +! ... |
| 41 | +call set_ISIZE2OFA(lda) ! leading dimension of A |
| 42 | +call set_ISIZE2OFB(ldb) ! leading dimension of B |
| 43 | +call dgemm_b(transa, transb, m, n, k, alpha, alphab, a, ab, lda, b, bb, ldb, beta, betab, c, cb, ldc) |
| 44 | +call set_ISIZE2OFA(-1) |
| 45 | +call set_ISIZE2OFB(-1) |
| 46 | +``` |
| 47 | + |
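For orientation, the adjoint updates that a reverse-mode DGEMM is expected to perform can be written out for the non-transposed case (`transa = transb = 'N'`). This is a sketch derived from the DGEMM definition $C \leftarrow \alpha A B + \beta C$, assuming the usual reverse-mode convention that adjoints of inputs are accumulated; it is not taken from the generated code. Barred symbols stand for the `b`-suffixed arguments (`alphab`, `ab`, `bb`, `betab`, `cb`), and $C$ is the value of `c` on entry:

```math
\begin{aligned}
\bar{\alpha} &\leftarrow \bar{\alpha} + \sum_{i,j} \bar{C}_{ij}\,(AB)_{ij} \\
\bar{A} &\leftarrow \bar{A} + \alpha\,\bar{C}\,B^{T} \\
\bar{B} &\leftarrow \bar{B} + \alpha\,A^{T}\,\bar{C} \\
\bar{\beta} &\leftarrow \bar{\beta} + \sum_{i,j} \bar{C}_{ij}\,C_{ij} \\
\bar{C} &\leftarrow \beta\,\bar{C}
\end{aligned}
```

Under this convention, `cb` holds the seed adjoint of the output on entry, and the input adjoints should be initialized (typically to zero) before the call.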
| 48 | +Matrix routines often use `set_ISIZE2OFA` / `set_ISIZE2OFB`; level-1 routines with vectors may use `set_ISIZE1OFX`, `set_ISIZE1OFY`, or `set_ISIZE1OFZx` / `set_ISIZE1OFZy`. The tests in `BLAS/test/` show the exact `set_ISIZE*` calls for each routine. If a required setter is not called, the differentiated routine will stop with an error. You can optionally reset the dimensions by calling `set_ISIZE*(-1)` so that the next use must set the dimension again. This is recommended if you reuse the same differentiated routine in multiple contexts. |
| 49 | + |
| 50 | +***Handling directions in vector mode:*** |
| 51 | +The number of directions is currently hardcoded to 4. If you need a different number, download the source of this repository, edit `nbdirsmax` in both `BLAS/include/DIFFSIZES.inc` and `BLAS/include/DIFFSIZES.F90`, and rebuild the library. |
| 52 | +```fortran |
| 53 | +INTEGER, PARAMETER :: nbdirsmax = 4 |
| 54 | +``` |
| 55 | +```fortran |
| 56 | + integer nbdirsmax |
| 57 | + parameter (nbdirsmax=4) |
| 58 | +``` |
| 59 | +
|
| 60 | +
|
| 61 | +**Minimal example (forward mode, one routine):** Compile your main program and link with the built library and BLAS: |
6 | 62 |
|
7 | 63 | ```bash |
8 | | -meson setup builddir |
9 | | -meson compile -C builddir |
10 | | -meson install -C builddir |
| 64 | +gfortran -O2 -I/path/to/diff-lapack/BLAS/include -c my_main.f90 -o my_main.o |
| 65 | +gfortran -o my_main my_main.o /path/to/diff-lapack/builddir/libdiffblas.a -L$LAPACKDIR -lrefblas |
11 | 66 | ``` |
12 | 67 |
|
13 | | -By default, Meson compiles a static libary `libdiffblas.a`. |
14 | | -For a shared library, please use the option `-Ddefault_library=shared`. |
15 | | -To install in a specific location, use the option `--prefix=install/dir` |
| 68 | +(If you built with Make, use `BLAS/build/libdiffblas_d.a` for forward mode only.) The tests in `BLAS/test/` (e.g. `test_dgemm.f90`, `test_dgemm_reverse.f90`) are full examples of how to declare and call the differentiated routines and how to link; use them as templates for your application. |
| 69 | + |
| 70 | +## Building the library (and tests) |
| 71 | + |
| 72 | +You need **pre-generated** sources (see "Creating the library sources with run_tapenade_blas.py" below). The build compiles them and links with a BLAS library and the Tapenade adStack runtime. |
| 73 | + |
| 74 | +### Build with Meson (library and tests) |
| 75 | + |
| 76 | +**Dependencies:** |
| 77 | + |
| 78 | +- **Fortran compiler** (e.g. gfortran, ifort, ifx) and **C compiler** (e.g. gcc). |
| 79 | +- **LAPACK installation** — a built Reference LAPACK (or compatible) providing BLAS (e.g. `librefblas.a` or `libblas.a`). Set **`LAPACKDIR`** (or equivalent) so Meson can find it (see below). |
| 80 | +- **Tapenade adStack** — the repo already contains `TAPENADE/adStack.c` and `TAPENADE/include/`; Meson compiles and links these automatically. No separate Tapenade install is required for the build. |
| 81 | + |
| 82 | +**Configure and build from the project root:** |
16 | 83 |
|
17 | 84 | ```bash |
18 | | -meson setup builddir -Ddefault_library=shared --prefix=$(pwd)/diffblas |
| 85 | +# If BLAS is in a custom location (e.g. LAPACK build dir), pass library search path |
| 86 | +meson setup builddir -Dlibblas=refblas -Dlibblas_path=/path/to/lapack/build |
| 87 | + |
| 88 | +# Or rely on system / environment (e.g. LIBRARY_PATH, or refblas in default path) |
| 89 | +meson setup builddir -Dlibblas=refblas |
| 90 | + |
19 | 91 | meson compile -C builddir |
20 | | -meson install -C builddir |
21 | 92 | ``` |
22 | 93 |
|
23 | | -## Launching the differentiation |
| 94 | +This produces `builddir/libdiffblas.a` (or a shared library if configured with `-Ddefault_library=shared`). The Meson build uses `BLAS/meson.build` and compiles everything in `BLAS/src/` plus `BLAS/include/DIFFSIZES.f90`, `BLAS/src/DIFFSIZES_access.f`, and `TAPENADE/adStack.c`; it does **not** build the test executables (those are built by the BLAS Makefile; see "Build with Make" below). |
24 | 95 |
|
25 | | -Get the latest version from the LAPACK webpage |
| 96 | +To also run the tests, build and run them with the BLAS Makefile (see "Build with Make" below). To install the library: |
26 | 97 |
|
27 | | -```shell |
28 | | -wget -q -O tmp.html https://raw.githubusercontent.com/Reference-LAPACK/lapack/refs/heads/master/README.md |
29 | | -VERSION=`cat tmp.html | grep "VERSION" | tail -1 | awk '{print $3}'` |
30 | | -rm tmp.html |
| 98 | +```bash |
| 99 | +meson configure builddir -Dprefix=/your/install && meson install -C builddir |
31 | 100 | ``` |
32 | | -And use this to create the link to the most recent LAPACK release |
33 | 101 |
|
34 | | -```shell |
35 | | -wget -P tmp/ https://github.com/Reference-LAPACK/lapack/archive/refs/tags/v$VERSION.tar.gz |
36 | | -tar xvzf tmp/v$VERSION.tar.gz -C tmp/ |
37 | | -rm tmp/v$VERSION.tar.gz |
38 | | -python run_tapenade_blas.py --input-dir tmp/lapack-$VERSION/BLAS/SRC/ --out-dir out |
39 | | -``` |
| 102 | +### Build with Make (library and tests) |
| 103 | + |
| 104 | +**Dependencies:** |
40 | 105 |
|
41 | | -These commands are gathered in a bash file `get_and_run.sh` calling it directly from bash |
| 106 | +- **Fortran compiler** (e.g. gfortran) and **C compiler** (e.g. gcc). |
| 107 | +- **LAPACK installed** — build and install [Reference LAPACK](https://github.com/Reference-LAPACK/lapack) so that you have `librefblas` (or set `BLAS_LIB` in the Makefile to point at your BLAS). |
| 108 | +- **`LAPACKDIR`** — set to the directory where LAPACK was built (or where `librefblas.a` lives), so the linker can find `-lrefblas`: |
| 109 | + ```bash |
| 110 | + export LAPACKDIR=/path/to/lapack/build |
| 111 | + ``` |
| 112 | +- **Tapenade adStack** — the Makefile looks for `adStack.c` in this order: `BLAS/src/adStack.c`, then `../TAPENADE/adStack.c`, then `$TAPENADEDIR/ADFirstAidKit/adStack.c`. The repo’s `TAPENADE/` copy is used by default; no Tapenade install needed if you do not override. |
42 | 113 |
|
43 | | -Some remarks: |
44 | | -------------- |
| 114 | +**Build from the BLAS directory:** |
45 | 115 |
|
46 | | -* If one needs to run only a subset of the BLAS files: |
47 | | -```shell |
48 | | -python run_tapenade_blas.py --input-dir tmp/lapack-$VERSION/BLAS/SRC/ --out-dir out --file FILE1 FILE2 |
| 116 | +```bash |
| 117 | +cd BLAS |
| 118 | +export LAPACKDIR=/path/to/your/lapack/build # or wherever librefblas is |
| 119 | +make |
49 | 120 | ``` |
50 | | -* The script has been tested for Python3. One could need to adapt the command line with the python version depending on your system. |
51 | 121 |
|
52 | | -## Calling the tests |
| 122 | +This builds per-mode static libraries (`build/libdiffblas_d.a`, `libdiffblas_b.a`, `libdiffblas_dv.a`, `libdiffblas_bv.a`) and test executables in `build/`. Run tests: |
53 | 123 |
|
54 | | -```shell |
55 | | -cd out/ |
56 | | -make # Will compile everything |
57 | | -./run_tests.sh # will run all available tests |
| 124 | +```bash |
| 125 | +./run_tests.sh |
| 126 | +# or run individual test executables, e.g. build/test_dgemm, build/test_dgemm_reverse |
58 | 127 | ``` |
59 | 128 |
|
60 | | -Similarly, one can limit to certain functions only: |
61 | | -```shell |
62 | | -make dgemm # Will compile only dgemm |
| 129 | +## Creating the library sources with run_tapenade_blas.py |
| 130 | + |
| 131 | +This step **generates** the differentiated Fortran sources and test programs (e.g. under `BLAS/src/`, `BLAS/test/`, or a custom output directory like `out/`). |
| 132 | + |
| 133 | +**Dependencies:** |
| 134 | + |
| 135 | +- **LAPACK source tree** — Reference LAPACK (or at least its BLAS SRC directory) so that `--input-dir` points to the BLAS source files (e.g. `lapack-3.x/BLAS/SRC/`). |
| 136 | +- **Tapenade** — installed and on your `PATH` (or pass `--tapenade-bin=/path/to/tapenade`). |
| 137 | +- **Python 3** — to run `run_tapenade_blas.py`. |
| 138 | + |
| 139 | +**Example: generate from a Reference LAPACK tarball** |
| 140 | + |
| 141 | +```bash |
| 142 | +# Download and unpack Reference LAPACK (or set LAPACK_SRC to your tree) |
| 143 | +wget -q -O tmp.html https://raw.githubusercontent.com/Reference-LAPACK/lapack/refs/heads/master/README.md |
| 144 | +VERSION=$(grep VERSION tmp.html | tail -1 | awk '{print $3}') |
| 145 | +rm tmp.html |
| 146 | +wget -P tmp/ https://github.com/Reference-LAPACK/lapack/archive/refs/tags/v${VERSION}.tar.gz |
| 147 | +tar xzf tmp/v${VERSION}.tar.gz -C tmp/ |
| 148 | + |
| 149 | +# Generate differentiated BLAS into BLAS/ (flat layout: src/, test/, include/) |
| 150 | +python run_tapenade_blas.py --input-dir tmp/lapack-${VERSION}/BLAS/SRC/ --out-dir BLAS --flat |
63 | 151 | ``` |
64 | 152 |
|
65 | | -This library generates a differentiated BLAS code using Tapenade in the following modes. |
66 | | -1. Forward mode |
67 | | -2. Vector forward mode |
68 | | -3. Reverse mode |
69 | | -4. Vector reverse mode |
| 153 | +**Options:** Use `--file dgemm.f sgemm.f` to restrict to specific routines; `--mode d b` to generate only certain modes; see `python run_tapenade_blas.py --help`. |
| 154 | + |
| 155 | +## Future work |
| 156 | +Several BLAS routines are not yet differentiated and verified; we plan to add them. We also plan to add differentiated versions of the CBLAS routines. |
| 157 | +We are beginning to write a differentiated version of the LAPACK library. |
| 158 | + |
| 159 | +## Acknowledgement |
| 160 | +This work was supported in part by the Applied Mathematics activity within the U.S. Department of Energy, Office of Science, Office |
| 161 | +of Advanced Scientific Computing Research Applied Mathematics, and Office of Nuclear Physics SciDAC program under Contract No. DE-AC02-06CH11357. This work was supported in part by NSF CSSI grant 2104068. |