Information-Geometric Context Window Governance for Large Language Models via Observer Entropy and the Cognitive Phase Law (CPL 4.0) (2.0).
- Scientific article and associated documentation
- Source code and simulation scripts
Based on: Khomyakov, V. (2026). Information-Geometric Context Window Governance for Large Language Models via Observer Entropy and the Cognitive Phase Law (CPL 4.0) (2.0). Zenodo. doi:10.5281/zenodo.19177363
An interactive CPL Context Governor simulation is available online:
https://khomyakov-vladimir.github.io/llm-context-window-governance/
The dashboard implements the discrete Eqs. (9), (11), and (14) with policy (18-19) and allows real-time parameter exploration (α_tight, κ_L, γ, T, and others).
Large Language Models exhibit semantic drift, coherence loss, and latency spikes as the context window fills. Existing mitigation strategies (sliding windows, RoPE extensions, fixed-interval summarization) operate on heuristic length thresholds and ignore the model's internal behavioral state. This framework applies the Cognitive Phase Law (CPL 4.0) to trigger context compression only when the model actually enters a degraded phase, replacing blind length-based policies with state-aware governance.
CPL 4.0 defines three cognitive phases, Coherence (C), Reorganization (R), and Fragmentation (F), through observable entropy, semantic stability, and the rate of entropy change. This framework maps that logic onto the LLM inference loop: at each request-response turn, the model's phase is classified, and context management actions plus decoding parameters are selected accordingly.
x_k = (L_k, Ĥ_k, Ŝ_k, D̂_k, z_k, c_k)
| Component | Description |
|---|---|
| L_k | Context length (tokens) |
| Ĥ_k | Observer entropy (KL divergence of the output distribution from its coarse-grained projection: S_obs(p_θ, ε)) |
| Ŝ_k | Semantic stability = 1 − Ĥ_k / S_obs^max ∈ [0, 1] (algebraically coupled to observer entropy via Definition 3 / Corollary 2.4 of the preprint) |
| D̂_k | Discrete entropy derivative: `\|Ĥ_k − Ĥ_{k−1}\|` |
| z_k | Current phase: C / R / F |
| c_k | Compressed memory size |
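The state vector can be carried directly in code. A minimal sketch, assuming a plain container is sufficient (`CPLState` and its field names are illustrative, not part of the framework):

```python
from dataclasses import dataclass

@dataclass
class CPLState:
    """Per-step governor state x_k (illustrative container, names assumed)."""
    L: int    # context length in tokens (L_k)
    H: float  # observer entropy Ĥ_k
    S: float  # semantic stability Ŝ_k = 1 - Ĥ_k / S_obs^max
    D: float  # discrete entropy derivative D̂_k = |Ĥ_k - Ĥ_{k-1}|
    z: str    # current phase: "C", "R", or "F"
    c: int    # compressed memory size

    @classmethod
    def from_entropy(cls, L: int, H: float, H_prev: float,
                     S_obs_max: float, z: str = "C", c: int = 0) -> "CPLState":
        """Derive Ŝ_k and D̂_k from the raw entropy readings."""
        return cls(L=L, H=H, S=1.0 - H / S_obs_max, D=abs(H - H_prev), z=z, c=c)
```

Because Ŝ_k and D̂_k are algebraic functions of the entropy readings, only L, Ĥ, and the previous Ĥ need to be measured per turn.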
Phases are assigned using the "Reorganization first" priority rule:

```
if   D̂_k ≥ γ                  → z_k = R  (Reorganization)
elif Ĥ_k < H_c and Ŝ_k > S_c  → z_k = C  (Coherence)
else                          → z_k = F  (Fragmentation)
```

Canonical thresholds (from CPL 4.0 human-observer calibration; must be recalibrated from operational LLM logs before production deployment): H_c = ln(3) ≈ 1.099 nats, S_c = 0.7, γ = 0.1.
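The entropy surrogate Ĥ_k is defined above as the KL divergence of the output distribution from its coarse-grained projection, S_obs(p_θ, ε). A minimal sketch, assuming the coarse-graining is a simple partition of the support into equal-size bins (the preprint's exact projection may differ):

```python
import math

def observer_entropy(p: list[float], bin_size: int) -> float:
    """KL divergence D(p || q) of a distribution p from its coarse-grained
    projection q, where q replaces each probability by the mean of its bin.
    Assumed surrogate for S_obs(p_θ, ε); the paper's projection may differ."""
    kl = 0.0
    for start in range(0, len(p), bin_size):
        block = p[start:start + bin_size]
        mean = sum(block) / len(block)          # coarse-grained value for this bin
        for pi in block:
            if pi > 0.0:
                kl += pi * math.log(pi / mean)  # nats
    return kl
```

A distribution that is uniform within each bin coincides with its projection and scores zero; a peaked distribution scores higher, signaling fine-grained structure that the coarse-graining discards.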
The action at each step is determined jointly by the phase and the current context length:
| Condition | Action (m_k) | Token release |
|---|---|---|
| z_k = F | chunk (extract + compress) | r_recover (aggressive) |
| z_k = R | summarize | r_rescue |
| z_k = C and L_k > L_warn | summarize | r_rescue |
| z_k = C and L_k ≤ L_warn | keep (no action) | 0 |
Context thresholds are ordered as: L_recover < L_warn < L_cap < L_practical ≤ L_max.
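Since the guarantees below depend on this ordering, it is worth enforcing at configuration time. A small sketch (the class name and concrete values are assumptions for illustration):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ContextThresholds:
    """Context-length thresholds; must satisfy
    L_recover < L_warn < L_cap < L_practical <= L_max."""
    L_recover: int
    L_warn: int
    L_cap: int
    L_practical: int
    L_max: int

    def __post_init__(self) -> None:
        # Reject any configuration that breaks the required ordering.
        if not (self.L_recover < self.L_warn < self.L_cap
                < self.L_practical <= self.L_max):
            raise ValueError(
                "thresholds must satisfy "
                "L_recover < L_warn < L_cap < L_practical <= L_max")
```

Rejecting a mis-ordered configuration up front is cheaper than debugging a governor that silently never triggers recovery.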
```
if z_k ∈ {R, F} or L_k > L_warn:
    θ_k = θ_tight   # reduced temperature / top-p
else:
    θ_k = θ_base    # standard parameters
```
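The θ_base / θ_tight switch can be expressed as two decoding presets. The concrete temperature and top-p values below are illustrative placeholders, not values from the preprint:

```python
# Illustrative presets; the preprint does not prescribe these numbers.
THETA_BASE = {"temperature": 0.8, "top_p": 0.95}   # standard decoding
THETA_TIGHT = {"temperature": 0.3, "top_p": 0.80}  # reduced entropy production

def select_decoding(phase: str, context_length: int, L_warn: int) -> dict:
    """Return θ_k: tight decoding in R/F or past the warning length, else base."""
    if phase in ("R", "F") or context_length > L_warn:
        return THETA_TIGHT
    return THETA_BASE
```

The returned dictionary can be passed straight through to any sampler that accepts temperature and top-p keyword arguments.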
- **Hard context invariant.** Given correctly configured thresholds, `L_k ≤ L_cap` for all `k`: the model never enters the catastrophic-latency regime.
- **Entropy contraction.** Under tight decoding, expected entropy decreases by at least `α · Δ` per step while `Ĥ_k` exceeds its target by `Δ`.
- **Bounded degradation.** The number of Fragmentation steps grows as `O(√T)`, not linearly with the horizon `T`.
See the preprint for complete proofs and assumptions.
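The entropy-contraction property can be illustrated with a toy iteration: while Ĥ exceeds its target by Δ, each tight-decoding step removes exactly the guaranteed minimum α·Δ, so the excess decays geometrically. This is a sketch of the stated bound, not the preprint's proof:

```python
def contract_entropy(H0: float, H_target: float, alpha: float,
                     max_steps: int = 200) -> list[float]:
    """Iterate H <- H - alpha * (H - H_target) while H is above target.
    Each step removes exactly alpha * Δ, the minimum guaranteed decrement."""
    trace = [H0]
    H = H0
    for _ in range(max_steps):
        delta = H - H_target
        if delta <= 1e-6:
            break
        H -= alpha * delta
        trace.append(H)
    return trace

trace = contract_entropy(H0=2.0, H_target=1.0, alpha=0.25)
# The excess Δ shrinks by a factor (1 - α) per step: geometric convergence.
```

With α = 0.25 the excess halves roughly every 2.4 steps, so the entropy reaches its target long before a linear-in-T number of degraded steps could accumulate.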
```
┌──────────────────────────────────────────────┐
│                Inference Loop                │
│                                              │
│  1. Receive user request (U_k tokens)        │
│  2. Compute surrogates:                      │
│     · Ĥ_k (observer entropy S_obs(p_θ, ε))   │
│     · Ŝ_k = 1 − Ĥ_k / S_obs^max              │
│     · D̂_k = |Ĥ_k − Ĥ_{k−1}|                  │
│  3. Classify phase z_k                       │
│  4. Select action (m_k, θ_k)                 │
│  5. Apply context management:                │
│     · keep / summarize / chunk+retrieve      │
│  6. Set decoding parameters                  │
│  7. Generate response (Y_k tokens)           │
│  8. Update L_{k+1}, Ĥ_{k+1}, Ŝ_{k+1}         │
│  9. Log metrics                              │
│                                              │
│  ↻ Repeat for step k+1                       │
└──────────────────────────────────────────────┘
```
The phase classifier can be integrated into any inference pipeline (Transformers, LangChain, vLLM, etc.) in a few lines:
```python
def get_cpl_phase(
    entropy: float,
    stability: float,
    delta_entropy: float,
    H_c: float = 1.099,
    S_c: float = 0.7,
    gamma: float = 0.1,
) -> str:
    """Classify the current cognitive phase per CPL 4.0.

    Args:
        entropy: Observer entropy S_obs(p_θ, ε), the KL divergence of the
            output distribution from its coarse-grained projection (Ĥ_k).
        stability: Semantic stability Ŝ_k = 1 − Ĥ_k / S_obs^max ∈ [0, 1]
            (algebraically coupled to observer entropy; see Definition 3
            of the preprint).
        delta_entropy: |Ĥ_k − Ĥ_{k−1}|, the discrete entropy derivative.
        H_c: Entropy threshold (default: ln 3 ≈ 1.099 nats, the CPL 4.0
            human-observer value; must be recalibrated from operational
            LLM logs for production use).
        S_c: Stability threshold (default: 0.7).
        gamma: Entropy-rate threshold (default: 0.1).

    Returns:
        "C" (Coherence), "R" (Reorganization), or "F" (Fragmentation).
    """
    if delta_entropy >= gamma:
        return "R"  # Reorganization: entropy is changing rapidly
    elif entropy < H_c and stability > S_c:
        return "C"  # Coherence: stable, low-entropy regime
    return "F"  # Fragmentation: degraded state
```
```python
def select_action(phase: str, context_length: int, L_warn: int):
    """Select context-management action and decoding mode.

    Returns:
        (action, decoding_mode) where action ∈ {keep, summarize, chunk}
        and decoding_mode ∈ {base, tight}.
    """
    if phase == "F":
        return "chunk", "tight"
    elif phase == "R":
        return "summarize", "tight"
    elif context_length > L_warn:
        return "summarize", "tight"
    return "keep", "base"
```

Note. The canonical thresholds (`H_c`, `S_c`, `γ`) are derived from human-observer calibration in CPL 4.0. For production LLM deployment, these values must be recalibrated from operational logs. Empirical validation remains future work; see the preprint for a full discussion of assumptions and limitations.
All thresholds (H_c, S_c, γ, L_warn, L_cap, δ, and the remaining governor parameters) are designed to be calibrated from operational telemetry. The preprint provides the formal structure; empirical fitting to specific models and workloads is left as future work.
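Putting the pieces together, the inference loop sketched earlier can be run end to end. Everything below is a self-contained toy: the per-turn entropy trace is synthetic, `summarize`/`chunk` simply shrink the context by assumed fixed ratios, and the thresholds are placeholders to be calibrated as discussed above:

```python
def cpl_phase(H: float, S: float, D: float,
              H_c: float = 1.099, S_c: float = 0.7, gamma: float = 0.1) -> str:
    """Reorganization-first phase rule (mirrors get_cpl_phase above)."""
    if D >= gamma:
        return "R"
    if H < H_c and S > S_c:
        return "C"
    return "F"

def govern(entropies: list[float], S_obs_max: float = 3.0,
           L_warn: int = 6000, L_cap: int = 8000, turn_tokens: int = 900):
    """Run the governor over a synthetic per-turn entropy trace.
    Returns (phases, context_lengths); compression ratios are assumed."""
    L, H_prev = 0, 0.0
    phases, lengths = [], []
    for H in entropies:
        L += turn_tokens                 # 1. receive request + response tokens
        S = 1.0 - H / S_obs_max          # 2. surrogates
        D = abs(H - H_prev)
        z = cpl_phase(H, S, D)           # 3. classify phase
        if z == "F":
            L = int(L * 0.4)             # 5. chunk: aggressive compression (assumed ratio)
        elif z == "R" or L > L_warn:
            L = int(L * 0.7)             # 5. summarize (assumed ratio)
        phases.append(z)                 # 9. log metrics
        lengths.append(L)
        H_prev = H                       # 8. update for the next turn
    assert max(lengths) <= L_cap, "hard context invariant violated"
    return phases, lengths

phases, lengths = govern([0.5, 0.6, 1.4, 1.5, 0.8, 0.7, 0.7, 0.7, 0.7, 0.7])
```

Even in this toy, the recorded context length never exceeds L_cap, because compression fires on phase degradation or on crossing L_warn rather than waiting for the hard limit.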
- KLEO 2.4.1: Khomyakov, V. (2026). KL-Geometric Structure of Observer Entropy: A Minimal Information-Theoretic Framework (2.4.1). Zenodo. https://doi.org/10.5281/zenodo.19202244
- CPL 4.0: Khomyakov, V. (2025). The Law of Cognitive Phases (CPL): Formalizing Observer State Transitions in Subjective Physics (4.0). Zenodo. https://doi.org/10.5281/zenodo.17788635
- MMSF 12.0: Khomyakov, V. (2025). Cognitive Projection and Observer Entropy: A Minimal Model of Subjective Physics (12.0). Zenodo. https://doi.org/10.5281/zenodo.17407408
- This framework: Khomyakov, V. (2026). Information-Geometric Context Window Governance for Large Language Models via Observer Entropy and the Cognitive Phase Law (CPL 4.0) (2.0). Zenodo. https://doi.org/10.5281/zenodo.19177363
Author identification and project information:
ORCID: 0009-0006-3074-9145
- Scientific article and associated documentation (PDF, figures, LaTeX sources): CC BY 4.0
- Source code and simulation scripts: MIT License