A toolkit for running FK-eABF (Force-Kernel eABF) enhanced sampling simulations in PLUMED and recovering free energy landscapes from the results.
FK-eABF is an adaptive biasing force method that uses an extended Lagrangian (fictitious particle λ coupled to the real collective variable z) and a kernel-based mean force estimator. The CZAR estimator on the real CV z provides an unbiased free energy gradient that is integrated into a free energy landscape in post-processing.
If you use FK-eABF in your work, please cite:
Kang, C.; Verma, R.; Sonpal, A.; Shoji, A.; Chipot, C.; Pfaendtner, J. A Force-Kernel Reformulation of the Extended-System Adaptive Biasing Force for Free-Energy Calculations. J. Chem. Theory Comput., submitted (2026). DOI: to be added upon acceptance.
- PLUMED (with
forcekernel.cppcompiled as a plugin viaLOAD) - A MD engine supported by PLUMED, or the built-in
pesmdtoy integrator for 2D potentials - Python 3 with NumPy and SciPy for post-processing
The plugin is loaded at runtime via the PLUMED LOAD directive — no recompilation of PLUMED is required. Configure your plumed.dat to load the plugin and define the FKERNELABF action:
LOAD FILE=./forcekernel.cpp
cv: <YOUR_CV_DEFINITION>
fk: FKERNELABF ...
# CV Definition
ARG=cv
# Extended Lagrangian Options
KAPPA=3000.0 # coupling spring constant (kJ/mol/nm^2)
TAU=0.5 # fictitious particle time constant (ps)
FRICTION=8.0 # Langevin friction (ps^-1)
TEMP=300 # temperature (K)
# FK-eABF Options
GRIDMIN=-1.5 # CV domain lower bound
GRIDMAX=1.5 # CV domain upper bound
SIGMA=0.05 # initial kernel width
SIGMA_MIN=0.01 # minimum kernel width
GRIDSIZE=100
# Data Accumulation and Biasing Force Update options
PACE=5 # steps between data accumulation
GRIDPACE=1000 # steps between biasing force updates
# Output options
CZARSTRIDE=50000 # steps between CZAR kernel file writes
KERNELINFOSTRIDE=500 # match this to your PRINT STRIDE
...
PRINT FILE=COLVAR STRIDE=500 ARG=*
If you're setting up a new system, the following decision tree walks through the main parameter choices in roughly the order they should be made:
%%{init: {'flowchart': {'nodeSpacing': 25, 'rankSpacing': 35}}}%%
flowchart TD
Start([Configure FKERNELABF]) --> CV{Periodic CV?}
CV -->|No| CVN[Set GRIDMIN, GRIDMAX]
CV -->|Yes| Sigma
CVN --> Sigma{Bin width<br/>known?}
Sigma -->|Yes| SS[SIGMA = bin width<br/>SIGMA_MIN ≈ SIGMA / 2]
Sigma -->|No| SA[Omit SIGMA, set SIGMA_MIN<br/>adaptive σ₀ warmup]
SS --> Kappa["KAPPA so √(kT/κ) ≪ SIGMA_MIN<br/>typically 1000–5000"]
SA --> Kappa
Kappa --> Engine{Engine?}
Engine -->|Classical| MD[GRIDPACE 500–1000]
Engine -->|AIMD| AIMD[PACE = 1, GRIDPACE 50–200]
MD --> Bias[BIASFACTOR 2–10]
AIMD --> Bias
Bias --> Out[CZARSTRIDE as needed<br/>KERNELINFOSTRIDE = PRINT STRIDE]
Out --> End([Run])
TLDR; set SIGMA, SIGMA_MIN, and GRIDSIZE (plus GRIDMIN/GRIDMAX for non-periodic CVs). Everything else can be left at its default. The three parameters above are technically optional but should be treated as mandatory in practice — getting them right is the difference between a simulation that converges efficiently and one that wastes compute time.
SIGMA and SIGMA_MIN. SIGMA is the initial kernel bandwidth; SIGMA_MIN is the floor below which adaptive Silverman rescaling cannot shrink it. Omitting SIGMA_MIN lets the kernel population grow without bound as the bandwidth contracts, wasting memory and slowing the kernel search. Use the bin width you would adopt for eABF as a guide: set SIGMA to that value and SIGMA_MIN to half of it (e.g., a 5° dihedral bin → SIGMA ≈ 0.087 rad, SIGMA_MIN ≈ 0.04 rad).
GRIDSIZE. The mean-force grid is where NW regression is evaluated and multilinearly interpolated between rebuilds. The grid resolution does not affect kernel accumulation or the recovered free energy — it controls how faithfully the cancellation force is applied between updates. By default (GRIDSIZE=0), FK-eABF auto-sizes the grid so that spacing equals 2 × SIGMA_MIN (the effective kernel diameter), with a floor of 72 points per dimension. An explicit GRIDSIZE producing coarser spacing triggers a warning but does not abort the run.
| Keyword | Default | Description |
|---|---|---|
ARG |
— | Collective variables (1–3 supported). |
KAPPA |
— | Spring constant(s) for z–λ coupling (kJ/mol/unit²). Larger κ → tighter coupling, smaller σ = √(kT/κ). One value or one per CV. |
TAU |
0.5 |
Oscillation period(s) of λ (time units). Sets the fictitious mass m = κτ²/(4π²). |
FRICTION |
10.0 |
Langevin friction on λ (1/time_unit). One value or one per CV. |
TEMP |
300.0 |
Temperature (K). |
PACE |
5 |
Force-sample deposition interval (MD steps). |
THRESH |
1.0 |
Kernel merge threshold in σ-normalised distance. OPES standard; lower → more compression, higher → more kernels. |
NSIGMACUT |
4.0 |
Kernel cutoff in σ per dimension for NW regression. 4.0 gives <2% contribution at the boundary. |
BIASFACTOR |
1.0 |
Exploration factor γ. 1.0 = pure ABF. >1.0 adds density-based exploration on λ via V_ex = c·ln(1 + Z/Z₀) where c = kT(γ−1). The CZAR estimator on z is unaffected. |
EXPLORSCALE |
1.0 |
Per-CV scaling of the exploration force. 0.0 disables exploration on that CV (e.g., 1.0, 0.0 to drive only the first of two CVs). |
MUXCLAMP |
500.0 |
Per-kernel mean-force clamp on absorption (kJ/mol/unit). |
MAXFORCE |
500.0 |
Grid mean-force clamp per node before interpolation (kJ/mol/unit). |
GRIDSIZE |
0 (auto) |
Grid points per dimension. Auto-size: N = ceil(range / (2 × SIGMA_MIN)), floor 72. Defaults to 72 when SIGMA_MIN is unset. |
GRIDPACE |
500 |
Mean-force grid rebuild interval. Reduce for AIMD. |
| Keyword | Default | Description |
|---|---|---|
SIGMA |
(auto) | Initial kernel bandwidth σ₀. Omit entirely for adaptive mode (CV variance measured during an unbiased warmup). |
SIGMA_MIN |
(none) | Bandwidth floor. Set to roughly half SIGMA so free-energy resolution can sharpen with more sampling. |
ADAPTIVE_SIGMA_STRIDE |
10 × PACE |
Length of the unbiased warmup for automatic σ₀ determination. Used only when SIGMA is omitted. |
FIXED_SIGMA |
false |
Disable Silverman rescaling — all kernels use σ₀ permanently. |
| Keyword | Default | Description |
|---|---|---|
GRIDMIN |
(from CV) | Lower grid bound(s) for non-periodic CVs. Reflecting walls applied automatically. |
GRIDMAX |
(from CV) | Upper grid bound(s) for non-periodic CVs. |
| Keyword | Default | Description |
|---|---|---|
NONLIST |
false |
Disable the neighbor list (brute-force kernel search). |
NLIST_PARAMETERS |
3.0 0.5 |
Cutoff factor and skin factor. Includes kernels within cutoff × NSIGMACUT × σ; rebuilds when the query point drifts by skin × dev². |
All filenames are derived from the action label (e.g., fk: FKERNELABF ... → fk.*).
| Keyword | Default | Description |
|---|---|---|
CZARSTRIDE |
(off) | Step-stamped CZAR z-kernel snapshots → {label}.czar_kernels_{step:08d}.dat. Feed to czar_integrate to recover A(z). |
KERNELSTRIDE |
(off) | Step-stamped λ-kernel snapshots → {label}.kernels_{step:08d}.dat. |
LAMBDAGRIDSTRIDE |
(off) | NW mean-force debug grid every N steps → {label}.lambda_grid_{step:08d}.dat. Bias force on the λ grid, not the free energy. |
STATESTRIDE |
CZARSTRIDE, else 10 × GRIDPACE |
Restart state cadence → {label}.state.dat (overwritten in place). State is also written automatically whenever the MD engine writes its own checkpoint. See Restarts below. |
KERNELINFOSTRIDE |
PACE |
Kernel diagnostics line every N steps → {label}.kernelinfo.dat. Set this to match your PRINT STRIDE (e.g. 500); the default of PACE writes at every kernel deposition and adds significant I/O overhead. |
For the included Müller-Brown benchmark, run with PLUMED's built-in 2D toy integrator:
plumed pesmd < pesmd.inThis executes the simulation defined in pesmd.in (10M steps on the 2D Müller-Brown potential) driven by plumed.dat, and writes CZAR kernel snapshots at the configured stride.
FK-eABF writes a complete restart state to {label}.state.dat at STATESTRIDE intervals, and additionally whenever the host MD engine writes its own checkpoint (e.g., a GROMACS .cpt). Coupling to the engine checkpoint keeps the PLUMED state coherent with the trajectory frame; without it, a restart can pick up a state from a slightly different step than the trajectory and the |z − s_fict| diagnostic will flag the mismatch.
The state file contains everything needed to resume bit-for-bit: kernel populations (with stable IDs), σ₀ and adaptive-warmup status, fictitious-particle position and velocity, exploration density Z₀, ID counters, and the full mt19937 RNG state. The mean-force grid itself is not serialized — it is rebuilt from the kernels on restart.
To resume, add RESTART to your plumed.dat (or pass --restart to the MD engine):
RESTART
LOAD FILE=./forcekernel.cpp
fk: FKERNELABF ...
On restart, FK-eABF prints a banner summarising what was loaded (kernel counts, totalN, σ₀, adaptive status, fictitious particle, RNG), reconstructs the mean-force grid from the kernels, and reports rebuild statistics (populated fraction, |F_abf| max). On the first MD step it also logs |z − s_fict| per CV; this should be small relative to √(kT/κ). A large value usually means the trajectory checkpoint and state file are out of sync.
Reliability features. State writes use a write-to-tmp + atomic rename strategy with fsync() for on-disk durability, falling back to direct overwrite if rename fails (common on Lustre/GPFS/NFS). On a successful read, the loaded file is copied to bck.{label}.state.dat.{N} (DRR-style backup-on-load) so the next overwrite cannot clobber a known-good restart point. Dimensionality or temperature mismatches between the state file and the current input abort with an error; a missing state file under RESTART warns and starts from scratch.
Compile czar_integrate.cpp:
g++ -O2 -o czar_integrate czar_integrate.cpp -lm Use the executable to process CZAR kernel files:
./czar_integrate FEL_snapshots -d /path/to/scanThe only required argument is the output directory for PMFs. By default czar_integrate scans the current directory; use -d to point elsewhere. All files matching *czar_kernels_XXXXXXXX.dat are integrated and written as FEL_XXXXXXXX.dat.
For 1D systems, integration uses the trapezoidal rule. For 2D and higher, integration uses an MC random walk (same conventions as abf_integrate). The sigma0 and sigma_min headers in the kernel files enable proper KDE normalization (α_k = ∏ σ₀/σ_k) for variable-bandwidth kernels and automatic grid sizing.
| Flag | Argument | Default | Description |
|---|---|---|---|
-n |
<steps> |
0 |
MC integration steps. 0 = auto-converge on RMSD. |
-h |
<height> |
0.01 |
Initial MC hill height. |
-f |
<factor> |
0.5 |
Hill reduction factor (applied after warmup). |
-t |
<kT> |
(from file) | Override kT (kJ/mol). |
-g |
<pts> |
0 (auto) |
Integration grid points. Auto-sized from sigma_min header (default 100 if absent). |
-s |
<nsigma> |
4.0 |
Kernel cutoff in σ units. |
-m |
<minpop> |
1e-3 |
Minimum density fraction for the allowed region (below → NaN). |
-d |
<dir> |
. |
Directory to scan (batch mode). |
-i |
<file> |
— | Process a single kernel file. |
-o |
<file> |
FEL_czar.dat |
Output filename (single-file mode). |
-v |
— | off | Verbose progress and convergence diagnostics. |
-S |
<step> |
0 |
Skip kernel files before this step. |
# Batch: scan current directory, write FEL snapshots
./czar_integrate FEL_snapshots
# Batch with fixed MC steps and user-specified height
./czar_integrate FEL_snapshots -n 5000000 -h 0.2
# Skip files before step 5M
./czar_integrate FEL_snapshots -n 5000000 -h 0.2 -S 5000000
# Scan a different directory
./czar_integrate FEL_snapshots -d /path/to/run
# Single file
./czar_integrate -i fk.czar_kernels_10000000.dat -o PMF.dat
# Fine grid, verbose
./czar_integrate FEL_snapshots -g 150 -vSingle-file mode (-i): space-separated columns z0, z1, …, czar_grad0, czar_grad1, …, ptilde, A_czar, where ptilde is the biased density (NW denominator) and A_czar is the free energy in kJ/mol shifted to zero at the minimum.
Batch mode (default): a simpler format with columns z0, z1, …, A.
In both modes, points below the population threshold are written as nan. For 2D+ grids, blank lines separate slices along the first dimension (gnuplot pm3d compatible).
fkabf_diagnostics.py processes the COLVAR and {label}.kernelinfo.dat files in the current directory to produce summary plots:
python fkabf_diagnostics.py| Flag | Argument | Default | Description |
|---|---|---|---|
--colvar |
<file> |
COLVAR |
PLUMED COLVAR file. |
--kernelinfo |
<file> |
(auto) | {label}.kernelinfo.dat. Skipped if absent. |
--prefix |
<label> |
(auto) | Action label prefix. Auto-detected from _fict columns. |
--dt |
<float> |
0.001 |
MD timestep (for converting time → steps). |
--thinning |
<int> |
10 |
Plot every Nth point in scatter / trajectory plots. |
--periodic |
<spec> |
(none) | Periodic CV spec for minimum-image z−λ. Format: "cv1:min:max,cv2:min:max" or "cv1:period". Supports pi. |
--outdir |
<dir> |
. |
Figure output directory. |
# Auto-detect everything in current directory
python fkabf_diagnostics.py
# Specify files and output location
python fkabf_diagnostics.py --colvar COLVAR --kernelinfo fk.kernelinfo.dat --outdir plots/
# Alanine dipeptide with periodic CVs
python fkabf_diagnostics.py --periodic "phi:-pi:pi,psi:-pi:pi" --dt 0.002
# Dense trajectory, less thinning
python fkabf_diagnostics.py --thinning 2| File | Contents |
|---|---|
fig_trajectory.pdf |
Per-CV: z and λ time series, z−λ over time, and z−λ histogram (minimum-image for periodic CVs). |
fig_bias.pdf |
|F_bias| and V_ex over time. |
fig_kernels.pdf |
Kernel counts M and M_z, n_eff, compression N/M, Silverman σ per CV, Z₀ and Z(λ) if present. |
fig_exploration.pdf |
2D scatter of z and λ trajectories side-by-side, colored by time (2+ CVs only). |
fig_phase.pdf |
z vs λ scatter per CV, colored by time. Spread indicates coupling width √(kT/κ). |
fig_nlist.pdf |
Neighbor list size and nlker/M fraction over time. |
A text summary (CV ranges, z−λ standard deviation, kernel counts, compression ratio, convergence metrics) is printed to stdout before figure generation.
FK-eABF is designed to converge quickly, but fast convergence does not absolve the practitioner of proving that convergence has actually been achieved. A free-energy surface that looks reasonable is not the same as one that is correct. The following checks should be treated as mandatory.
Verify extended-system synchronization. All extended-system ABF methods rely on λ remaining well coupled to z; if they desynchronize, CZAR receives corrupted force samples. Confirm that the z − λ distribution is centered at zero with a width consistent with σ ≈ √(kT/κ) — fkabf_diagnostics.py produces this histogram automatically (fig_phase.pdf, fig_trajectory.pdf). A bimodal, skewed, or excessively broad distribution means κ or τ should be adjusted before trusting the result.
Run multiple independent replicas. A single trajectory that appears converged may have settled into a local minimum of the estimator without sampling all relevant basins. Run at least two — preferably three — independent replicas from different initial conditions and compare the resulting FELs. Agreement between replicas, not internal smoothness of a single run, is the minimum standard for convergence.
Cross-method validation. Self-consistency within a single method is necessary but not sufficient: simulations can satisfy every standard self-convergence criterion while producing quantitatively incorrect free-energy profiles. For at least one system in any study, run a parallel calculation with an independent method (OPES, WTM-eABF, REUS) and compare. Cross-method agreement is the only reliable criterion currently available for validating free-energy calculations on systems where the true answer is unknown.