
Information-Geometric Context Window Governance for Large Language Models via Observer Entropy and the Cognitive Phase Law (CPL 4.0) (2.0).

License: CC BY 4.0 (scientific article and associated documentation)
License: MIT (source code and simulation scripts)

Based on: Khomyakov, V. (2026). Information-Geometric Context Window Governance for Large Language Models via Observer Entropy and the Cognitive Phase Law (CPL 4.0) (2.0). Zenodo.
doi:10.5281/zenodo.19177363


🔬 Interactive Dashboard (Live Simulation)

An interactive CPL Context Governor simulation is available online:

👉 https://khomyakov-vladimir.github.io/llm-context-window-governance/

The dashboard implements discrete Eqs. (9), (11), (14) with policy (18–19) and allows real-time parameter exploration (σ, ρ, α_tight, κ_L, γ, T).


Problem Statement

Large Language Models exhibit semantic drift, coherence loss, and latency spikes as the context window fills up. Existing mitigation strategies (sliding windows, RoPE extensions, fixed-interval summarization) operate on heuristic length thresholds and ignore the model's internal behavioral state. This framework applies the Cognitive Phase Law (CPL 4.0) to trigger context compression only when the model actually enters a degraded phase, replacing blind length-based policies with state-aware governance.


Core Idea

CPL 4.0 defines three cognitive phases, Coherence (C), Reorganization (R), and Fragmentation (F), through observable entropy, semantic stability, and the rate of entropy change. This framework maps that logic onto the LLM inference loop: at each request–response turn, the model's phase is classified, and context management actions plus decoding parameters are selected accordingly.

Agent State (per step k)

x_k = (L_k, Ĥ_k, Ŝ_k, D̂_k, z_k, c_k)

| Component | Description |
| --- | --- |
| L_k | Context length (tokens) |
| Ĥ_k | Observer entropy (KL divergence of the output distribution from its coarse-grained projection: S_obs(p_θ, ε)) |
| Ŝ_k | Semantic stability = 1 − Ĥ_k / S_obs^max ∈ [0, 1] (algebraically coupled to observer entropy via Definition 3 / Corollary 2.4 of the preprint) |
| D̂_k | Discrete entropy derivative: \|Ĥ_k − Ĥ_{k−1}\| |
| z_k | Current phase: C / R / F |
| c_k | Compressed memory size |
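The per-step surrogates can be sketched in a few lines. This is a minimal illustration, not the preprint's exact definition: the coarse-graining rule below (merging all tokens with probability under ε into a single uniform bucket) is an assumption standing in for the projection in S_obs(p_θ, ε), and `observer_entropy` / `surrogates` are hypothetical helper names.

```python
import math

def observer_entropy(p, eps=0.05):
    # Hypothetical surrogate for S_obs(p_theta, eps): the KL divergence of the
    # output distribution p from a coarse-grained projection q that merges all
    # tokens with probability below eps into one uniformly spread bucket.
    # The exact coarse-graining rule here is an assumption, not the preprint's.
    head = [pi for pi in p if pi >= eps]
    tail = [pi for pi in p if pi < eps]
    q = head + ([sum(tail) / len(tail)] * len(tail) if tail else [])
    p_ord = head + tail  # reorder p so entries align with q
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p_ord, q) if pi > 0)

def surrogates(H_k, H_prev, S_obs_max):
    # Semantic stability per Definition 3, and the discrete entropy derivative.
    S_k = 1.0 - H_k / S_obs_max
    D_k = abs(H_k - H_prev)
    return S_k, D_k
```

The head tokens coincide with their projection and contribute nothing to the KL term, so only the merged tail drives the entropy estimate.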

Phase Classifier

Phases are assigned using the "Reorganization first" priority rule:

if   D̂_k ≥ γ                      →  z_k = R   (Reorganization)
elif Ĥ_k < H_c  and  Ŝ_k > S_c    →  z_k = C   (Coherence)
else                              →  z_k = F   (Fragmentation)

Canonical thresholds (from CPL 4.0 human-observer calibration; must be recalibrated from operational LLM logs before production deployment):
H_c = ln(3) ≈ 1.099 nats, S_c = 0.7, γ = 0.1.

Context Management Policy

The action at each step is determined jointly by the phase and the current context length:

| Condition | Action (m_k) | Token release |
| --- | --- | --- |
| z_k = F | chunk (extract + compress) | r_recover (aggressive) |
| z_k = R | summarize | r_rescue |
| z_k = C and L_k > L_warn | summarize | r_rescue |
| z_k = C and L_k ≤ L_warn | keep (no action) | 0 |

Context thresholds are ordered as: L_recover < L_warn < L_cap < L_practical ≀ L_max.
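A deployment can assert this ordering at startup. A minimal sketch, where `validate_thresholds` is a hypothetical helper and the numeric values in the usage line are purely illustrative:

```python
def validate_thresholds(L_recover, L_warn, L_cap, L_practical, L_max):
    # Enforce the required ordering L_recover < L_warn < L_cap < L_practical <= L_max.
    if not (L_recover < L_warn < L_cap < L_practical <= L_max):
        raise ValueError("CPL context threshold ordering violated")
    return True

validate_thresholds(2000, 6000, 7000, 7500, 8192)  # illustrative values
```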

Adaptive Decoding

if z_k ∈ {R, F}  or  L_k > L_warn:
    θ_k = θ_tight      # reduced temperature / top-p
else:
    θ_k = θ_base       # standard parameters
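In code, this rule reduces to a small selector. The concrete temperature and top-p values below are illustrative placeholders for θ_base and θ_tight, not values from the preprint:

```python
THETA_BASE = {"temperature": 0.8, "top_p": 0.95}   # illustrative theta_base
THETA_TIGHT = {"temperature": 0.3, "top_p": 0.70}  # illustrative theta_tight

def decoding_params(phase, context_length, L_warn):
    # Tight decoding when the phase is R or F, or the context exceeds L_warn.
    if phase in ("R", "F") or context_length > L_warn:
        return THETA_TIGHT
    return THETA_BASE
```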

Formal Guarantees

  1. Hard context invariant. Given correctly configured thresholds, L_k ≤ L_cap for all k: the model never enters the catastrophic-latency regime.
  2. Entropy contraction. Under tight decoding, expected entropy decreases by at least α · Δ per step while Ĥ_k exceeds its target by Δ.
  3. Bounded degradation. The number of Fragmentation steps grows as O(√T), not linearly with the horizon T.

See the preprint for complete proofs and assumptions.

Inference Pipeline Overview

┌──────────────────────────────────────────────┐
│              Inference Loop                  │
│                                              │
│  1. Receive user request  (U_k tokens)       │
│  2. Compute surrogates:                      │
│     · Ĥ_k  (observer entropy S_obs(p_θ, ε))  │
│     · Ŝ_k = 1 − Ĥ_k / S_obs^max              │
│     · D̂_k = |Ĥ_k − Ĥ_{k−1}|                  │
│  3. Classify phase  z_k                      │
│  4. Select action  (m_k, θ_k)                │
│  5. Apply context management:                │
│     · keep / summarize / chunk+retrieve      │
│  6. Set decoding parameters                  │
│  7. Generate response  (Y_k tokens)          │
│  8. Update  L_{k+1}, Ĥ_{k+1}, Ŝ_{k+1}        │
│  9. Log metrics                              │
│                                              │
│  → Repeat for step k+1                       │
└──────────────────────────────────────────────┘
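The loop can be exercised end-to-end with a stubbed model. This toy sketch uses random entropies in place of real model outputs, fixed token-release sizes as crude stand-ins for summarize/chunk, and illustrative thresholds; every constant here is an assumption, not a calibrated value:

```python
import math
import random

def classify(H, S, D, H_c=math.log(3), S_c=0.7, gamma=0.1):
    # "Reorganization first" priority rule from the Phase Classifier section.
    if D >= gamma:
        return "R"
    if H < H_c and S > S_c:
        return "C"
    return "F"

def govern(turns, L_warn=6000, L_cap=7000, S_obs_max=math.log(50000)):
    # Toy governance loop: entropies come from a stubbed model.
    L, H_prev, log = 0, 0.0, []
    for k in range(turns):
        L += random.randint(200, 800)      # stub: request + response tokens
        H = random.uniform(0.0, 2.0)       # stub: observer entropy H_k
        S = 1.0 - H / S_obs_max            # semantic stability S_k
        D = abs(H - H_prev)                # discrete entropy derivative D_k
        z = classify(H, S, D)
        if z == "F":
            L = max(0, L - 3000)           # chunk: aggressive release r_recover
        elif z == "R" or L > L_warn:
            L = max(0, L - 1500)           # summarize: release r_rescue
        assert L <= L_cap                  # hard context invariant holds
        H_prev = H
        log.append((k, z, L))
    return log
```

Because every turn adds at most 800 tokens and any length above L_warn triggers a release of at least 1500, the post-action length never exceeds L_warn, so the hard invariant L_k ≤ L_cap holds throughout the run.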

Implementation Note

The phase classifier can be integrated into any inference pipeline (Transformers, LangChain, vLLM, etc.) in a few lines:

def get_cpl_phase(
    entropy: float,
    stability: float,
    delta_entropy: float,
    H_c: float = 1.099,
    S_c: float = 0.7,
    gamma: float = 0.1,
) -> str:
    """Classify the current cognitive phase per CPL 4.0.

    Args:
        entropy:      Observer entropy S_obs(p_θ, ε), the KL divergence
                      from the coarse-grained projection (Ĥ_k).
        stability:    Semantic stability Ŝ_k = 1 − Ĥ_k / S_obs^max ∈ [0, 1]
                      (algebraically coupled to observer entropy; see
                      Definition 3 of the preprint).
        delta_entropy: |Ĥ_k − Ĥ_{k−1}|, the discrete entropy derivative.
        H_c:          Entropy threshold (default: ln 3 ≈ 1.099 nats; the
                      CPL 4.0 human-observer value, which must be recalibrated
                      from operational LLM logs for production use).
        S_c:          Stability threshold (default: 0.7).
        gamma:        Entropy-rate threshold (default: 0.1).

    Returns:
        "C" (Coherence), "R" (Reorganization), or "F" (Fragmentation).
    """
    if delta_entropy >= gamma:
        return "R"  # Reorganization: entropy is changing rapidly
    elif entropy < H_c and stability > S_c:
        return "C"  # Coherence: stable, low-entropy regime
    return "F"      # Fragmentation: degraded state


def select_action(phase: str, context_length: int, L_warn: int):  
    """Select context-management action and decoding mode.  

    Returns:  
        (action, decoding_mode) where action ∈ {keep, summarize, chunk}  
        and decoding_mode ∈ {base, tight}.  
    """  
    if phase == "F":  
        return "chunk", "tight"  
    elif phase == "R":  
        return "summarize", "tight"  
    elif context_length > L_warn:  
        return "summarize", "tight"  
    return "keep", "base"  

Note. The canonical thresholds (H_c, S_c, γ) are derived from human-observer calibration in CPL 4.0. For production LLM deployment, these values must be recalibrated from operational logs. Empirical validation remains future work; see the preprint for a full discussion of assumptions and limitations.

Calibration

All thresholds (H_c, S_c, γ, L_warn, L_cap, δ, σ) are designed to be calibrated from operational telemetry. The preprint provides the formal structure; empirical fitting to specific models and workloads is left as future work.
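One plausible starting point for such a fit is quantile-based: set H_c near the upper quantile of logged observer entropies and γ near the upper quantile of step-to-step changes. This is an assumption, not the preprint's procedure, and `calibrate_thresholds` with both quantile levels is hypothetical:

```python
def calibrate_thresholds(entropy_log, q_H=0.80, q_gamma=0.90):
    # Illustrative quantile-based fit: H_c at the q_H quantile of logged
    # observer entropies, gamma at the q_gamma quantile of step-to-step
    # absolute changes. Real calibration should follow the preprint.
    def quantile(values, q):
        v = sorted(values)
        return v[min(int(q * len(v)), len(v) - 1)]
    H_c = quantile(entropy_log, q_H)
    deltas = [abs(b - a) for a, b in zip(entropy_log, entropy_log[1:])]
    gamma = quantile(deltas, q_gamma) if deltas else 0.1
    return H_c, gamma
```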

References

Author identification and project information:
ORCID: 0009-0006-3074-9145

License

  • Scientific article and associated documentation (PDF, figures, LaTeX sources):
    CC BY 4.0
  • Source code and simulation scripts:
    MIT License
