FK-eABF

A toolkit for running FK-eABF (Force-Kernel eABF) enhanced sampling simulations in PLUMED and recovering free energy landscapes from the results.

FK-eABF is an adaptive biasing force method that uses an extended Lagrangian (fictitious particle λ coupled to the real collective variable z) and a kernel-based mean force estimator. The CZAR estimator on the real CV z provides an unbiased free energy gradient that is integrated into a free energy landscape in post-processing.

Citation

If you use FK-eABF in your work, please cite:

Kang, C.; Verma, R.; Sonpal, A.; Shoji, A.; Chipot, C.; Pfaendtner, J. A Force-Kernel Reformulation of the Extended-System Adaptive Biasing Force for Free-Energy Calculations. J. Chem. Theory Comput., submitted (2026). DOI: to be added upon acceptance.

Requirements

PLUMED (with forcekernel.cpp compiled as a plugin via LOAD)
A MD engine supported by PLUMED, or the built-in pesmd toy integrator for 2D potentials
Python 3 with NumPy and SciPy for post-processing

Workflow

1. Setting up an FK-eABF simulation

The plugin is loaded at runtime via the PLUMED LOAD directive — no recompilation of PLUMED is required. Configure your plumed.dat to load the plugin and define the FKERNELABF action:

LOAD FILE=./forcekernel.cpp

cv: <YOUR_CV_DEFINITION>

fk: FKERNELABF ...
    
    # CV Definition
    ARG=cv

    # Extended Lagrangian Options
    KAPPA=3000.0          # coupling spring constant (kJ/mol/nm^2)
    TAU=0.5               # fictitious particle time constant (ps)
    FRICTION=8.0          # Langevin friction (ps^-1)
    TEMP=300              # temperature (K)

    # FK-eABF Options
    GRIDMIN=-1.5          # CV domain lower bound
    GRIDMAX=1.5           # CV domain upper bound
    SIGMA=0.05            # initial kernel width
    SIGMA_MIN=0.01        # minimum kernel width
    GRIDSIZE=100

    # Data Accumulation and Biasing Force Update options
    PACE=5                # steps between data accumulation
    GRIDPACE=1000         # steps between biasing force updates

    # Output options
    CZARSTRIDE=50000      # steps between CZAR kernel file writes
    KERNELINFOSTRIDE=500  # match this to your PRINT STRIDE
...

PRINT FILE=COLVAR STRIDE=500 ARG=*

Parameter selection at a glance

If you're setting up a new system, the following decision tree walks through the main parameter choices in roughly the order they should be made:

%%{init: {'flowchart': {'nodeSpacing': 25, 'rankSpacing': 35}}}%%
flowchart TD
    Start([Configure FKERNELABF]) --> CV{Periodic CV?}
    CV -->|No| CVN[Set GRIDMIN, GRIDMAX]
    CV -->|Yes| Sigma
    CVN --> Sigma{Bin width<br/>known?}

    Sigma -->|Yes| SS[SIGMA = bin width<br/>SIGMA_MIN ≈ SIGMA / 2]
    Sigma -->|No| SA[Omit SIGMA, set SIGMA_MIN<br/>adaptive σ₀ warmup]

    SS --> Kappa["KAPPA so √(kT/κ) ≪ SIGMA_MIN<br/>typically 1000–5000"]
    SA --> Kappa

    Kappa --> Engine{Engine?}
    Engine -->|Classical| MD[GRIDPACE 500–1000]
    Engine -->|AIMD| AIMD[PACE = 1, GRIDPACE 50–200]

    MD --> Bias[BIASFACTOR 2–10]
    AIMD --> Bias
    Bias --> Out[CZARSTRIDE as needed<br/>KERNELINFOSTRIDE = PRINT STRIDE]
    Out --> End([Run])

TLDR; set SIGMA, SIGMA_MIN, and GRIDSIZE (plus GRIDMIN/GRIDMAX for non-periodic CVs). Everything else can be left at its default. The three parameters above are technically optional but should be treated as mandatory in practice — getting them right is the difference between a simulation that converges efficiently and one that wastes compute time.

SIGMA and SIGMA_MIN. SIGMA is the initial kernel bandwidth; SIGMA_MIN is the floor below which adaptive Silverman rescaling cannot shrink it. Omitting SIGMA_MIN lets the kernel population grow without bound as the bandwidth contracts, wasting memory and slowing the kernel search. Use the bin width you would adopt for eABF as a guide: set SIGMA to that value and SIGMA_MIN to half of it (e.g., a 5° dihedral bin → SIGMA ≈ 0.087 rad, SIGMA_MIN ≈ 0.04 rad).

GRIDSIZE. The mean-force grid is where NW regression is evaluated and multilinearly interpolated between rebuilds. The grid resolution does not affect kernel accumulation or the recovered free energy — it controls how faithfully the cancellation force is applied between updates. By default (GRIDSIZE=0), FK-eABF auto-sizes the grid so that spacing equals 2 × SIGMA_MIN (the effective kernel diameter), with a floor of 72 points per dimension. An explicit GRIDSIZE producing coarser spacing triggers a warning but does not abort the run.

Compulsory Keywords

Keyword	Default	Description
`ARG`	—	Collective variables (1–3 supported).
`KAPPA`	—	Spring constant(s) for z–λ coupling (kJ/mol/unit²). Larger κ → tighter coupling, smaller σ = √(kT/κ). One value or one per CV.
`TAU`	`0.5`	Oscillation period(s) of λ (time units). Sets the fictitious mass m = κτ²/(4π²).
`FRICTION`	`10.0`	Langevin friction on λ (1/time_unit). One value or one per CV.
`TEMP`	`300.0`	Temperature (K).
`PACE`	`5`	Force-sample deposition interval (MD steps).
`THRESH`	`1.0`	Kernel merge threshold in σ-normalised distance. OPES standard; lower → more compression, higher → more kernels.
`NSIGMACUT`	`4.0`	Kernel cutoff in σ per dimension for NW regression. 4.0 gives <2% contribution at the boundary.
`BIASFACTOR`	`1.0`	Exploration factor γ. `1.0` = pure ABF. `>1.0` adds density-based exploration on λ via V_ex = c·ln(1 + Z/Z₀) where c = kT(γ−1). The CZAR estimator on z is unaffected.
`EXPLORSCALE`	`1.0`	Per-CV scaling of the exploration force. `0.0` disables exploration on that CV (e.g., `1.0, 0.0` to drive only the first of two CVs).
`MUXCLAMP`	`500.0`	Per-kernel mean-force clamp on absorption (kJ/mol/unit).
`MAXFORCE`	`500.0`	Grid mean-force clamp per node before interpolation (kJ/mol/unit).
`GRIDSIZE`	`0` (auto)	Grid points per dimension. Auto-size: N = ceil(range / (2 × SIGMA_MIN)), floor 72. Defaults to 72 when `SIGMA_MIN` is unset.
`GRIDPACE`	`500`	Mean-force grid rebuild interval. Reduce for AIMD.

Optional Keywords — Bandwidth

Keyword	Default	Description
`SIGMA`	(auto)	Initial kernel bandwidth σ₀. Omit entirely for adaptive mode (CV variance measured during an unbiased warmup).
`SIGMA_MIN`	(none)	Bandwidth floor. Set to roughly half `SIGMA` so free-energy resolution can sharpen with more sampling.
`ADAPTIVE_SIGMA_STRIDE`	`10 × PACE`	Length of the unbiased warmup for automatic σ₀ determination. Used only when `SIGMA` is omitted.
`FIXED_SIGMA`	`false`	Disable Silverman rescaling — all kernels use σ₀ permanently.

Optional Keywords — Grid Bounds

Keyword	Default	Description
`GRIDMIN`	(from CV)	Lower grid bound(s) for non-periodic CVs. Reflecting walls applied automatically.
`GRIDMAX`	(from CV)	Upper grid bound(s) for non-periodic CVs.

Optional Keywords — Neighbor List

Keyword	Default	Description
`NONLIST`	`false`	Disable the neighbor list (brute-force kernel search).
`NLIST_PARAMETERS`	`3.0 0.5`	Cutoff factor and skin factor. Includes kernels within `cutoff × NSIGMACUT × σ`; rebuilds when the query point drifts by `skin × dev²`.

Optional Keywords — Output Files

All filenames are derived from the action label (e.g., fk: FKERNELABF ... → fk.*).

Keyword	Default	Description
`CZARSTRIDE`	(off)	Step-stamped CZAR z-kernel snapshots → `{label}.czar_kernels_{step:08d}.dat`. Feed to `czar_integrate` to recover A(z).
`KERNELSTRIDE`	(off)	Step-stamped λ-kernel snapshots → `{label}.kernels_{step:08d}.dat`.
`LAMBDAGRIDSTRIDE`	(off)	NW mean-force debug grid every N steps → `{label}.lambda_grid_{step:08d}.dat`. Bias force on the λ grid, not the free energy.
`STATESTRIDE`	`CZARSTRIDE`, else `10 × GRIDPACE`	Restart state cadence → `{label}.state.dat` (overwritten in place). State is also written automatically whenever the MD engine writes its own checkpoint. See Restarts below.
`KERNELINFOSTRIDE`	`PACE`	Kernel diagnostics line every N steps → `{label}.kernelinfo.dat`. Set this to match your `PRINT STRIDE` (e.g. 500); the default of `PACE` writes at every kernel deposition and adds significant I/O overhead.

2. Running the simulation

For the included Müller-Brown benchmark, run with PLUMED's built-in 2D toy integrator:

plumed pesmd < pesmd.in

This executes the simulation defined in pesmd.in (10M steps on the 2D Müller-Brown potential) driven by plumed.dat, and writes CZAR kernel snapshots at the configured stride.

Restarts

FK-eABF writes a complete restart state to {label}.state.dat at STATESTRIDE intervals, and additionally whenever the host MD engine writes its own checkpoint (e.g., a GROMACS .cpt). Coupling to the engine checkpoint keeps the PLUMED state coherent with the trajectory frame; without it, a restart can pick up a state from a slightly different step than the trajectory and the |z − s_fict| diagnostic will flag the mismatch.

The state file contains everything needed to resume bit-for-bit: kernel populations (with stable IDs), σ₀ and adaptive-warmup status, fictitious-particle position and velocity, exploration density Z₀, ID counters, and the full mt19937 RNG state. The mean-force grid itself is not serialized — it is rebuilt from the kernels on restart.

To resume, add RESTART to your plumed.dat (or pass --restart to the MD engine):

RESTART
LOAD FILE=./forcekernel.cpp
fk: FKERNELABF ...

On restart, FK-eABF prints a banner summarising what was loaded (kernel counts, totalN, σ₀, adaptive status, fictitious particle, RNG), reconstructs the mean-force grid from the kernels, and reports rebuild statistics (populated fraction, |F_abf| max). On the first MD step it also logs |z − s_fict| per CV; this should be small relative to √(kT/κ). A large value usually means the trajectory checkpoint and state file are out of sync.

Reliability features. State writes use a write-to-tmp + atomic rename strategy with fsync() for on-disk durability, falling back to direct overwrite if rename fails (common on Lustre/GPFS/NFS). On a successful read, the loaded file is copied to bck.{label}.state.dat.{N} (DRR-style backup-on-load) so the next overwrite cannot clobber a known-good restart point. Dimensionality or temperature mismatches between the state file and the current input abort with an error; a missing state file under RESTART warns and starts from scratch.

3. Recover the free energy landscape

Compile czar_integrate.cpp:

g++ -O2 -o czar_integrate czar_integrate.cpp -lm

Use the executable to process CZAR kernel files:

./czar_integrate FEL_snapshots -d /path/to/scan

The only required argument is the output directory for PMFs. By default czar_integrate scans the current directory; use -d to point elsewhere. All files matching *czar_kernels_XXXXXXXX.dat are integrated and written as FEL_XXXXXXXX.dat.

For 1D systems, integration uses the trapezoidal rule. For 2D and higher, integration uses an MC random walk (same conventions as abf_integrate). The sigma0 and sigma_min headers in the kernel files enable proper KDE normalization (α_k = ∏ σ₀/σ_k) for variable-bandwidth kernels and automatic grid sizing.

Options

Flag	Argument	Default	Description
`-n`	`<steps>`	`0`	MC integration steps. `0` = auto-converge on RMSD.
`-h`	`<height>`	`0.01`	Initial MC hill height.
`-f`	`<factor>`	`0.5`	Hill reduction factor (applied after warmup).
`-t`	`<kT>`	(from file)	Override kT (kJ/mol).
`-g`	`<pts>`	`0` (auto)	Integration grid points. Auto-sized from `sigma_min` header (default 100 if absent).
`-s`	`<nsigma>`	`4.0`	Kernel cutoff in σ units.
`-m`	`<minpop>`	`1e-3`	Minimum density fraction for the allowed region (below → NaN).
`-d`	`<dir>`	`.`	Directory to scan (batch mode).
`-i`	`<file>`	—	Process a single kernel file.
`-o`	`<file>`	`FEL_czar.dat`	Output filename (single-file mode).
`-v`	—	off	Verbose progress and convergence diagnostics.
`-S`	`<step>`	`0`	Skip kernel files before this step.

Examples

# Batch: scan current directory, write FEL snapshots
./czar_integrate FEL_snapshots

# Batch with fixed MC steps and user-specified height
./czar_integrate FEL_snapshots -n 5000000 -h 0.2

# Skip files before step 5M
./czar_integrate FEL_snapshots -n 5000000 -h 0.2 -S 5000000

# Scan a different directory
./czar_integrate FEL_snapshots -d /path/to/run

# Single file
./czar_integrate -i fk.czar_kernels_10000000.dat -o PMF.dat

# Fine grid, verbose
./czar_integrate FEL_snapshots -g 150 -v

Output Format

Single-file mode (-i): space-separated columns z0, z1, …, czar_grad0, czar_grad1, …, ptilde, A_czar, where ptilde is the biased density (NW denominator) and A_czar is the free energy in kJ/mol shifted to zero at the minimum.

Batch mode (default): a simpler format with columns z0, z1, …, A.

In both modes, points below the population threshold are written as nan. For 2D+ grids, blank lines separate slices along the first dimension (gnuplot pm3d compatible).

4. Additional diagnostics

fkabf_diagnostics.py processes the COLVAR and {label}.kernelinfo.dat files in the current directory to produce summary plots:

python fkabf_diagnostics.py

Options

Flag	Argument	Default	Description
`--colvar`	`<file>`	`COLVAR`	PLUMED COLVAR file.
`--kernelinfo`	`<file>`	(auto)	`{label}.kernelinfo.dat`. Skipped if absent.
`--prefix`	`<label>`	(auto)	Action label prefix. Auto-detected from `_fict` columns.
`--dt`	`<float>`	`0.001`	MD timestep (for converting time → steps).
`--thinning`	`<int>`	`10`	Plot every Nth point in scatter / trajectory plots.
`--periodic`	`<spec>`	(none)	Periodic CV spec for minimum-image z−λ. Format: `"cv1:min:max,cv2:min:max"` or `"cv1:period"`. Supports `pi`.
`--outdir`	`<dir>`	`.`	Figure output directory.

Examples

# Auto-detect everything in current directory
python fkabf_diagnostics.py

# Specify files and output location
python fkabf_diagnostics.py --colvar COLVAR --kernelinfo fk.kernelinfo.dat --outdir plots/

# Alanine dipeptide with periodic CVs
python fkabf_diagnostics.py --periodic "phi:-pi:pi,psi:-pi:pi" --dt 0.002

# Dense trajectory, less thinning
python fkabf_diagnostics.py --thinning 2

Output Figures

File	Contents
`fig_trajectory.pdf`	Per-CV: z and λ time series, z−λ over time, and z−λ histogram (minimum-image for periodic CVs).
`fig_bias.pdf`	\|F_bias\| and V_ex over time.
`fig_kernels.pdf`	Kernel counts M and M_z, n_eff, compression N/M, Silverman σ per CV, Z₀ and Z(λ) if present.
`fig_exploration.pdf`	2D scatter of z and λ trajectories side-by-side, colored by time (2+ CVs only).
`fig_phase.pdf`	z vs λ scatter per CV, colored by time. Spread indicates coupling width √(kT/κ).
`fig_nlist.pdf`	Neighbor list size and nlker/M fraction over time.

A text summary (CV ranges, z−λ standard deviation, kernel counts, compression ratio, convergence metrics) is printed to stdout before figure generation.

5. Validating your results

FK-eABF is designed to converge quickly, but fast convergence does not absolve the practitioner of proving that convergence has actually been achieved. A free-energy surface that looks reasonable is not the same as one that is correct. The following checks should be treated as mandatory.

Verify extended-system synchronization. All extended-system ABF methods rely on λ remaining well coupled to z; if they desynchronize, CZAR receives corrupted force samples. Confirm that the z − λ distribution is centered at zero with a width consistent with σ ≈ √(kT/κ) — fkabf_diagnostics.py produces this histogram automatically (fig_phase.pdf, fig_trajectory.pdf). A bimodal, skewed, or excessively broad distribution means κ or τ should be adjusted before trusting the result.

Run multiple independent replicas. A single trajectory that appears converged may have settled into a local minimum of the estimator without sampling all relevant basins. Run at least two — preferably three — independent replicas from different initial conditions and compare the resulting FELs. Agreement between replicas, not internal smoothness of a single run, is the minimum standard for convergence.

Cross-method validation. Self-consistency within a single method is necessary but not sufficient: simulations can satisfy every standard self-convergence criterion while producing quantitatively incorrect free-energy profiles. For at least one system in any study, run a parallel calculation with an independent method (OPES, WTM-eABF, REUS) and compare. Cross-method agreement is the only reliable criterion currently available for validating free-energy calculations on systems where the true answer is unknown.

Name		Name	Last commit message	Last commit date
Latest commit History 74 Commits
analysis		analysis
examples		examples
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

FK-eABF

Citation

Requirements

Workflow

1. Setting up an FK-eABF simulation

Parameter selection at a glance

Compulsory Keywords

Optional Keywords — Bandwidth

Optional Keywords — Grid Bounds

Optional Keywords — Neighbor List

Optional Keywords — Output Files

2. Running the simulation

Restarts

3. Recover the free energy landscape

Options

Examples

Output Format

4. Additional diagnostics

Options

Examples

Output Figures

5. Validating your results

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

FK-eABF

Citation

Requirements

Workflow

1. Setting up an FK-eABF simulation

Parameter selection at a glance

Compulsory Keywords

Optional Keywords — Bandwidth

Optional Keywords — Grid Bounds

Optional Keywords — Neighbor List

Optional Keywords — Output Files

2. Running the simulation

Restarts

3. Recover the free energy landscape

Options

Examples

Output Format

4. Additional diagnostics

Options

Examples

Output Figures

5. Validating your results

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages