Commit 76bd78d

Author: Yury Lysogorskiy
add active_learning.md, update utilities.md, update mkdocs.yml
1 parent b730248 commit 76bd78d

4 files changed

Lines changed: 150 additions & 1 deletion

docs/pacemaker/active_learning.md

Lines changed: 105 additions & 0 deletions
@@ -0,0 +1,105 @@
# Extrapolation grade and active learning

For any fitted ACE potential and its corresponding training set
(usually stored by `pacemaker` in the `fitting_data_info.pckl.gzip` file in the working directory),
one can generate the corresponding active set, either for the linear B-projections (default) or for the full non-linear embedding.
In practice, the linear active set is sufficient for estimating the extrapolation grade.
However, if you want a more sensitive (and "over-secure") extrapolation grade, the full active set can be used.

## Active set generation

`pace_activeset` is a utility that generates the active set used for extrapolation grade calculation.

```
usage: pace_activeset [-h] [-d DATASET] [-f] [-b BATCH_SIZE] [-g GAMMA_TOLERANCE] [-i MAXVOL_ITERS] [-r MAXVOL_REFINEMENT] [-m MEMORY_LIMIT] potential_file

Utility to compute active set for PACE (.yaml) potential

positional arguments:
  potential_file        B-basis file name (.yaml)

optional arguments:
  -h, --help            show this help message and exit
  -d DATASET, --dataset DATASET
                        Dataset file name, ex.: filename.pckl.gzip
  -f, --full            Compute active set on full (linearized) design matrix
  -b BATCH_SIZE, --batch_size BATCH_SIZE
                        Batch size (number of structures) considered simultaneously.If not provided - all dataset at once is considered
  -g GAMMA_TOLERANCE, --gamma_tolerance GAMMA_TOLERANCE
                        Gamma tolerance
  -i MAXVOL_ITERS, --maxvol_iters MAXVOL_ITERS
                        Number of maximum iteration in MaxVol algorithm
  -r MAXVOL_REFINEMENT, --maxvol_refinement MAXVOL_REFINEMENT
                        Number of refinements (epochs)
  -m MEMORY_LIMIT, --memory-limit MEMORY_LIMIT
                        Memory limit (i.e. 1GB, 500MB or 'auto')
```

Example of usage:

```
pace_activeset -d fitting_data_info.pckl.gzip output_potential.yaml
```
This generates the **linear** active set and stores it in the `output_potential.asi` file.

Alternatively,

```
pace_activeset -d fitting_data_info.pckl.gzip output_potential.yaml -f
```
generates the **full** active set (including the linearized part of the non-linear embedding function)
and stores it in the `output_potential.asi.nonlinear` file.

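When active sets are generated for several potentials, it can help to script the call. The following is a minimal sketch, not part of `pacemaker`: the helper names `build_activeset_command` and `expected_active_set_file` are hypothetical, and only flags listed in the help text above are used. The output file names follow the convention described above (`.asi` for the linear set, `.asi.nonlinear` for the full one).

```python
from pathlib import Path


def build_activeset_command(potential_file, dataset, full=False,
                            batch_size=None, memory_limit=None):
    """Assemble the argument list for `pace_activeset` (hypothetical helper)."""
    cmd = ["pace_activeset", "-d", str(dataset)]
    if full:
        cmd.append("-f")                      # full (linearized) design matrix
    if batch_size is not None:
        cmd += ["-b", str(batch_size)]        # number of structures per batch
    if memory_limit is not None:
        cmd += ["-m", str(memory_limit)]      # e.g. "1GB", "500MB" or "auto"
    cmd.append(str(potential_file))
    return cmd


def expected_active_set_file(potential_file, full=False):
    """File name that the text above says the active set is written to."""
    asi = Path(potential_file).with_suffix(".asi")
    return f"{asi}.nonlinear" if full else str(asi)


cmd = build_activeset_command("output_potential.yaml",
                              "fitting_data_info.pckl.gzip", full=True)
print(" ".join(cmd))
print(expected_active_set_file("output_potential.yaml", full=True))

# To actually run the utility (requires pacemaker to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting issues when file names contain spaces.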
## Usage of active set with LAMMPS

Example of usage of active set with LAMMPS:
```
pair_style  pace/extrapolation
pair_coeff  * * output_potential.yaml output_potential.asi Al Cu

# compute per-atom extrapolation grade every 10 steps
fix pace_gamma all pair 10 pace/extrapolation gamma 1
# compute maximum extrapolation grade over complete structure
compute max_pace_gamma all reduce max f_pace_gamma

# dump extrapolative structures if c_max_pace_gamma >= 5, skip otherwise, check every 20 steps
variable dump_skip equal "c_max_pace_gamma < 5"
dump pace_dump all custom 20 extrapolative_structures.dump id type x y z f_pace_gamma
dump_modify pace_dump skip v_dump_skip

# stop simulation if maximum extrapolation grade exceeds 25
variable max_pace_gamma equal c_max_pace_gamma
fix extreme_extrapolation all halt 10 v_max_pace_gamma > 25
```

See the [LAMMPS documentation](https://docs.lammps.org/latest/pair_pace.html) for more details and examples.

With this setup, you can run LAMMPS simulations and use the per-atom extrapolation grade via the `f_pace_gamma` fix
(e.g. in regular dumps and visualization), or the per-structure maximum extrapolation grade `c_max_pace_gamma` in `thermo_style`.

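The two thresholds in the LAMMPS input above act on the maximum per-atom extrapolation grade of a structure. As an illustration only, the same decision logic can be sketched in plain Python. The function name `classify_structure` is hypothetical, and the default values 5 and 25 are simply the ones used in the example input, not recommended settings.

```python
def classify_structure(per_atom_gamma, dump_threshold=5.0, halt_threshold=25.0):
    """Mirror the dump/halt decisions of the LAMMPS input for one structure."""
    # compute max_pace_gamma all reduce max f_pace_gamma
    max_gamma = max(per_atom_gamma)
    # dump_modify skip: the frame is skipped while max gamma < dump_threshold
    dump = not (max_gamma < dump_threshold)
    # fix halt: the run stops once max gamma > halt_threshold
    halt = max_gamma > halt_threshold
    return max_gamma, dump, halt


# three made-up sets of per-atom grades: interpolative, extrapolative, extreme
for gammas in ([0.4, 0.9, 1.2], [0.8, 6.5, 2.0], [3.0, 31.0, 1.1]):
    max_gamma, dump, halt = classify_structure(gammas)
    print(f"max gamma = {max_gamma}: dump = {dump}, halt = {halt}")
```

Note that a single atom with a large grade is enough to mark the whole structure as extrapolative, because the reduction is a maximum rather than an average.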
Two main scenarios:

1. Exploring new structures and dumping extrapolative structures (with `dump pace_dump`).
   In that case, extrapolative structures are stored in the `extrapolative_structures.dump` file, which can be loaded
   (e.g. with ASE), and DFT calculations can then be performed with the tools of your choice.
2. Performing normal simulations while monitoring the extrapolation grade (printing the `c_max_pace_gamma` variable)
   and stopping when the `extreme_extrapolation` fix triggers (with `fix halt`).

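For the first scenario, it is often useful to rank the dumped frames by their maximum extrapolation grade before selecting structures for DFT. The sketch below is a dependency-free illustration that assumes the standard LAMMPS text dump layout (`ITEM: TIMESTEP`, `ITEM: NUMBER OF ATOMS`, `ITEM: ATOMS ...`) and the `f_pace_gamma` column from the `dump` command above. The function name `max_gamma_per_frame` and the sample data are made up for this example; in practice, a reader such as ASE can be used instead.

```python
def max_gamma_per_frame(dump_text, column="f_pace_gamma"):
    """Return a list of (timestep, max gamma) for each frame of a LAMMPS text dump."""
    lines = dump_text.splitlines()
    frames = []
    timestep, n_atoms = None, 0
    i = 0
    while i < len(lines):
        line = lines[i].strip()
        if line == "ITEM: TIMESTEP":
            timestep = int(lines[i + 1])
            i += 2
        elif line == "ITEM: NUMBER OF ATOMS":
            n_atoms = int(lines[i + 1])
            i += 2
        elif line.startswith("ITEM: ATOMS"):
            # column names follow the "ITEM: ATOMS" keyword
            col = line.split()[2:].index(column)
            rows = lines[i + 1:i + 1 + n_atoms]
            gammas = [float(row.split()[col]) for row in rows]
            frames.append((timestep, max(gammas, default=0.0)))
            i += 1 + n_atoms
        else:
            i += 1  # box bounds and other header lines are ignored
    return frames


# made-up two-frame dump, for illustration only
sample = """ITEM: TIMESTEP
20
ITEM: NUMBER OF ATOMS
2
ITEM: BOX BOUNDS pp pp pp
0.0 4.05
0.0 4.05
0.0 4.05
ITEM: ATOMS id type x y z f_pace_gamma
1 1 0.0 0.0 0.0 0.7
2 2 2.0 2.0 2.0 6.3
ITEM: TIMESTEP
40
ITEM: NUMBER OF ATOMS
2
ITEM: BOX BOUNDS pp pp pp
0.0 4.05
0.0 4.05
0.0 4.05
ITEM: ATOMS id type x y z f_pace_gamma
1 1 0.1 0.0 0.0 1.5
2 2 2.0 2.1 2.0 12.0
"""

frames = max_gamma_per_frame(sample)
# most extrapolative frames first, as candidates for DFT calculations
ranked = sorted(frames, key=lambda frame: frame[1], reverse=True)
print(ranked)
```

Sorting by the maximum grade is only one possible selection criterion; other strategies (for example, picking structures above a fixed threshold) fit the same parsing step.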
## Usage of active set with ASE calculator

```python
from ase.build import bulk
from pyace import PyACECalculator

calc = PyACECalculator("output_potential.yaml")
calc.set_active_set("output_potential.asi")

# example structure (an assumption for illustration; use your own ASE Atoms object)
atoms = bulk("Al", cubic=True)

# attach the calculator to the ASE atoms
atoms.calc = calc

# trigger calculation
atoms.get_potential_energy()

# per-atom extrapolation grades are stored in
calc.results["gamma"]
```

docs/pacemaker/quickstart.md

Lines changed: 1 addition & 1 deletion
@@ -58,7 +58,7 @@ And it contains the following entries:
5858

5959
- Columns have the following meaning:
6060
- `ase_atoms`: is the instance of the [ASE](https://wiki.fysik.dtu.dk/ase/) Atoms class. This is the main form of storing structural information
61-
that `pacemkaer` relies on. It must contain information about atomic positions, corresponding atom types, pbc and lattice vectors.
61+
that `pacemaker` relies on. It must contain information about atomic positions, corresponding atom types, pbc and lattice vectors.
6262
- `energy`: total energy of the corresponding `ase_atoms` structure (in eV).
6363
- `forces`: corresponding atomic forces in the form of 2D array with dimensions [NumberOfAtoms, 3] (in eV/A).
6464
- `energy_corrected`: total energy of a structure minus a reference energy.

docs/pacemaker/utilities.md

Lines changed: 43 additions & 0 deletions
@@ -62,3 +62,46 @@ optional arguments:
```
  --free-atom-energy [FREE_ATOM_ENERGY [FREE_ATOM_ENERGY ...]]
                        dictionary of reference energies (i.e. Al:-0.123 Cu:-0.456 Zn:-0.789)
```
## Active set generation

`pace_activeset` is a utility that generates the active set used for extrapolation grade calculation.

```
usage: pace_activeset [-h] [-d DATASET] [-f] [-b BATCH_SIZE] [-g GAMMA_TOLERANCE] [-i MAXVOL_ITERS] [-r MAXVOL_REFINEMENT] [-m MEMORY_LIMIT] potential_file

Utility to compute active set for PACE (.yaml) potential

positional arguments:
  potential_file        B-basis file name (.yaml)

optional arguments:
  -h, --help            show this help message and exit
  -d DATASET, --dataset DATASET
                        Dataset file name, ex.: filename.pckl.gzip
  -f, --full            Compute active set on full (linearized) design matrix
  -b BATCH_SIZE, --batch_size BATCH_SIZE
                        Batch size (number of structures) considered simultaneously.If not provided - all dataset at once is considered
  -g GAMMA_TOLERANCE, --gamma_tolerance GAMMA_TOLERANCE
                        Gamma tolerance
  -i MAXVOL_ITERS, --maxvol_iters MAXVOL_ITERS
                        Number of maximum iteration in MaxVol algorithm
  -r MAXVOL_REFINEMENT, --maxvol_refinement MAXVOL_REFINEMENT
                        Number of refinements (epochs)
  -m MEMORY_LIMIT, --memory-limit MEMORY_LIMIT
                        Memory limit (i.e. 1GB, 500MB or 'auto')
```

Example of usage:

```
pace_activeset -d fitting_data_info.pckl.gzip output_potential.yaml
```
This generates the **linear** active set and stores it in the `output_potential.asi` file.

Alternatively,

```
pace_activeset -d fitting_data_info.pckl.gzip output_potential.yaml -f
```
generates the **full** active set (including the linearized part of the non-linear embedding function)
and stores it in the `output_potential.asi.nonlinear` file.

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -22,6 +22,7 @@ nav:
2222
- Installation: pacemaker/install.md
2323
- Quick start: pacemaker/quickstart.md
2424
- Pacemaker workflow: pacemaker/workflow.md
25+
- Extrapolation grade and active learning: pacemaker/active_learning.md
2526
- Input file: pacemaker/inputfile.md
2627
- CLI: pacemaker/cli.md
2728
- Utilities: pacemaker/utilities.md
