| 1 | +# Extrapolation grade and active learning |
| 2 | + |
| 3 | +For any fitted ACE potential and its training set |
| 4 | +(usually stored by `pacemaker` in the `fitting_data_info.pckl.gzip` file in the working directory), |
| 5 | +one can generate the corresponding active set, either for the linear B-projections (default) or for the full non-linear embedding. |
| 6 | +In practice, the linear active set is sufficient for estimating the extrapolation grade. |
| 7 | +However, if you want a more sensitive (and "over-secure") extrapolation grade, the full active set can be used. |
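For orientation, the extrapolation grade in MaxVol-based (D-optimality) schemes is commonly written as below. This is a sketch and an assumption, not a statement of what `pacemaker` implements internally: the symbols $\mathbf{b}_i$ (row vector of B-projections of atom $i$) and $A$ (square active set matrix selected by MaxVol from the training set) are names introduced here for illustration.

$$
\gamma_i = \max_k \left| \left( \mathbf{b}_i A^{-1} \right)_k \right|
$$

Under this definition, $\gamma_i \le 1$ would indicate interpolation within the active set, and larger values would indicate increasingly strong extrapolation.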
| 8 | + |
| 9 | + |
| 10 | + |
| 11 | +## Active set generation |
| 12 | + |
| 13 | +The `pace_activeset` utility generates the active set used to calculate the extrapolation grade. |
| 14 | + |
| 15 | +``` |
| 16 | +usage: pace_activeset [-h] [-d DATASET] [-f] [-b BATCH_SIZE] [-g GAMMA_TOLERANCE] [-i MAXVOL_ITERS] [-r MAXVOL_REFINEMENT] [-m MEMORY_LIMIT] potential_file |
| 17 | + |
| 18 | +Utility to compute active set for PACE (.yaml) potential |
| 19 | + |
| 20 | +positional arguments: |
| 21 | +potential_file B-basis file name (.yaml) |
| 22 | + |
| 23 | +optional arguments: |
| 24 | + -h, --help show this help message and exit |
| 25 | + -d DATASET, --dataset DATASET |
| 26 | + Dataset file name, ex.: filename.pckl.gzip |
| 27 | + -f, --full Compute active set on full (linearized) design matrix |
| 28 | + -b BATCH_SIZE, --batch_size BATCH_SIZE |
| 29 | + Batch size (number of structures) considered simultaneously.If not provided - all dataset at once is considered |
| 30 | + -g GAMMA_TOLERANCE, --gamma_tolerance GAMMA_TOLERANCE |
| 31 | + Gamma tolerance |
| 32 | + -i MAXVOL_ITERS, --maxvol_iters MAXVOL_ITERS |
| 33 | + Number of maximum iteration in MaxVol algorithm |
| 34 | + -r MAXVOL_REFINEMENT, --maxvol_refinement MAXVOL_REFINEMENT |
| 35 | + Number of refinements (epochs) |
| 36 | + -m MEMORY_LIMIT, --memory-limit MEMORY_LIMIT |
| 37 | + Memory limit (i.e. 1GB, 500MB or 'auto') |
| 38 | +``` |
| 39 | + |
| 40 | +Example of usage: |
| 41 | + |
| 42 | +``` |
| 43 | +pace_activeset -d fitting_data_info.pckl.gzip output_potential.yaml |
| 44 | +``` |
| 45 | +This generates the **linear** active set and stores it in the `output_potential.asi` file. |
| 46 | + |
| 47 | +Alternatively: |
| 48 | + |
| 49 | +``` |
| 50 | +pace_activeset -d fitting_data_info.pckl.gzip output_potential.yaml -f |
| 51 | +``` |
| 52 | +This generates the **full** active set (including the linearized part of the non-linear embedding function) |
| 53 | +and stores it in the `output_potential.asi.nonlinear` file. |
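When the active set has to be regenerated for many potentials, the command line can be assembled from a script. The following is a minimal sketch that uses only the flags listed in the help text above; the helper name `build_pace_activeset_cmd` and the batch size of 1000 are hypothetical and chosen for illustration.

```python
import shlex


def build_pace_activeset_cmd(potential_file, dataset, full=False,
                             batch_size=None, memory_limit=None):
    """Assemble a pace_activeset command line from the documented flags."""
    cmd = ["pace_activeset", "-d", dataset]
    if full:
        cmd.append("-f")  # full (linearized) design matrix
    if batch_size is not None:
        cmd += ["-b", str(batch_size)]  # structures processed per batch
    if memory_limit is not None:
        cmd += ["-m", str(memory_limit)]  # e.g. "1GB", "500MB" or "auto"
    cmd.append(potential_file)  # positional argument: B-basis .yaml file
    return cmd


cmd = build_pace_activeset_cmd(
    "output_potential.yaml",
    "fitting_data_info.pckl.gzip",
    full=True,
    batch_size=1000,  # hypothetical value
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` to execute the utility.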
| 54 | + |
| 55 | +## Usage of active set with LAMMPS |
| 56 | + |
| 57 | +Example LAMMPS input that uses the active set: |
| 58 | +``` |
| 59 | +pair_style pace/extrapolation |
| 60 | +pair_coeff * * output_potential.yaml output_potential.asi Al Cu |
| 61 | + |
| 62 | +# compute per-atom extrapolation grade every 10 steps |
| 63 | +fix pace_gamma all pair 10 pace/extrapolation gamma 1 |
| 64 | +# compute maximum extrapolation grade over complete structure |
| 65 | +compute max_pace_gamma all reduce max f_pace_gamma |
| 66 | + |
| 67 | +# dump extrapolative structures if c_max_pace_gamma > 5, skip otherwise, check every 20 steps |
| 68 | +variable dump_skip equal "c_max_pace_gamma < 5" |
| 69 | +dump pace_dump all custom 20 extrapolative_structures.dump id type x y z f_pace_gamma |
| 70 | +dump_modify pace_dump skip v_dump_skip |
| 71 | + |
| 72 | +# stop simulation if maximum extrapolation grade exceeds 25 |
| 73 | +variable max_pace_gamma equal c_max_pace_gamma |
| 74 | +fix extreme_extrapolation all halt 10 v_max_pace_gamma > 25 |
| 75 | +``` |
| 76 | + |
| 77 | +Check the [LAMMPS documentation](https://docs.lammps.org/latest/pair_pace.html) for more details and an example. |
| 78 | + |
| 79 | +With this setup, you can run LAMMPS simulations and use the per-atom extrapolation grade from the `f_pace_gamma` fix |
| 80 | +(e.g. in regular dumps and visualization) or the per-structure maximum extrapolation grade `c_max_pace_gamma` in `thermo_style`. |
| 81 | + |
| 82 | +Two main scenarios: |
| 83 | +1. Exploring new structures and dumping the extrapolative ones (with `dump pace_dump`). |
| 84 | +In this case, extrapolative structures are stored in the `extrapolative_structures.dump` file, which can be loaded |
| 85 | +(e.g. with ASE) so that DFT calculations can be performed with the tools of your choice. |
| 86 | +2. Performing normal simulations while monitoring the extrapolation grade (printing the `c_max_pace_gamma` variable) |
| 87 | +and stopping when the `extreme_extrapolation` fix triggers (with `fix halt`). |
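For the first scenario, it can help to rank the dumped frames by their maximum extrapolation grade before selecting structures for DFT. The sketch below parses the text dump directly with the standard library, assuming the column layout `id type x y z f_pace_gamma` from the `dump` command above; the function name `max_gamma_per_frame` and the inline sample data are hypothetical.

```python
def max_gamma_per_frame(lines, column="f_pace_gamma"):
    """Return [(timestep, max_gamma), ...] for a LAMMPS text dump."""
    lines = [line.strip() for line in lines]
    results = []
    timestep = None
    n_atoms = 0
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith("ITEM: TIMESTEP"):
            timestep = int(lines[i + 1])
            i += 2
        elif line.startswith("ITEM: NUMBER OF ATOMS"):
            n_atoms = int(lines[i + 1])
            i += 2
        elif line.startswith("ITEM: ATOMS"):
            # Column names follow the "ITEM: ATOMS" keyword
            col = line.split()[2:].index(column)
            rows = lines[i + 1 : i + 1 + n_atoms]
            gammas = [float(row.split()[col]) for row in rows]
            results.append((timestep, max(gammas)))
            i += 1 + n_atoms
        else:
            i += 1  # box bounds and other lines are skipped
    return results


# Hypothetical two-frame dump, used only to illustrate the parser
example = """ITEM: TIMESTEP
20
ITEM: NUMBER OF ATOMS
2
ITEM: BOX BOUNDS pp pp pp
0.0 4.05
0.0 4.05
0.0 4.05
ITEM: ATOMS id type x y z f_pace_gamma
1 1 0.0 0.0 0.0 1.5
2 2 2.0 2.0 2.0 7.25
ITEM: TIMESTEP
40
ITEM: NUMBER OF ATOMS
2
ITEM: BOX BOUNDS pp pp pp
0.0 4.05
0.0 4.05
0.0 4.05
ITEM: ATOMS id type x y z f_pace_gamma
1 1 0.1 0.0 0.0 0.8
2 2 2.1 2.0 2.0 30.0
"""

frames = max_gamma_per_frame(example.splitlines())
# Most extrapolative frames first
ranked = sorted(frames, key=lambda frame: frame[1], reverse=True)
print(ranked)
```

For a real run, the lines would come from `open("extrapolative_structures.dump")` instead of the inline string.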
| 88 | + |
| 89 | +## Usage of active set with ASE calculator |
| 90 | + |
| 91 | +```python |
| 92 | +from pyace import PyACECalculator |
| 93 | + |
| 94 | +calc = PyACECalculator("output_potential.yaml") |
| 95 | +calc.set_active_set("output_potential.asi") |
| 96 | + |
| 97 | +# attach the calculator to an existing ASE Atoms object (`atoms`) |
| 98 | +atoms.calc = calc |
| 99 | + |
| 100 | +# trigger calculation |
| 101 | +atoms.get_potential_energy() |
| 102 | + |
| 103 | +# per-atom extrapolation grades are stored in |
| 104 | +calc.results["gamma"] |
| 105 | +``` |
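The per-atom grades can then be reduced to a per-structure decision, mirroring the LAMMPS setup above. This is a minimal sketch: the function name `classify_structure` and the labels are hypothetical, and the thresholds 5 and 25 are simply the values used in the LAMMPS example, not recommended defaults.

```python
def classify_structure(gammas, dump_threshold=5.0, halt_threshold=25.0):
    """Reduce per-atom extrapolation grades to (max_gamma, label)."""
    max_gamma = max(gammas)
    if max_gamma > halt_threshold:
        label = "extreme"  # would trigger `fix halt` in the LAMMPS example
    elif max_gamma >= dump_threshold:
        label = "extrapolative"  # would be written by `dump pace_dump`
    else:
        label = "reliable"
    return max_gamma, label


# Hypothetical per-atom grades; in practice pass calc.results["gamma"]
print(classify_structure([0.3, 1.2, 6.4]))
```

The function accepts any one-dimensional sequence of numbers, so the array stored in `calc.results["gamma"]` can be passed in directly.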