
CFFI backend parity gaps: garbled ZstdError messages, __exit__ ordering mismatch with C backend, minor parameter-validation drift #301

@devdanzin


Summary

The pure-Python CFFI backend has several small but real parity gaps with the C extension backend. The most visible: several ZstdError constructors are called with a format string plus a separate positional argument, as if %-formatting would be applied. It is not, so user-facing error messages render as garbled tuples. Two additional gaps affect resource-lifecycle invariants and parameter validation.

Impact

  • Severity: User-visible garbled error messages (main issue); resource-lifecycle invariant violation in __exit__ (secondary); small parameter-validation drift.
  • Reachability: Any user running the CFFI backend — typically PyPy, or environments where the C extension can't be built.
  • Version: 0.25.0 (commit 7a77a75).
  • Note: This report was produced without a live CFFI environment; the findings below are based on static review of zstandard/backend_cffi.py. Happy to verify with a concrete reproducer if you set up a PyPy / CFFI-only test environment.

Gap 1: Tuple-args to ZstdError — garbled error messages

Pattern: raise ZstdError("... %s", error) passes two positional args to the exception constructor, making .args == ("... %s", error). No %-formatting happens, so str() of the exception renders the args tuple, i.e. ('... %s', <repr of error>), instead of the intended interpolated string.

Sites: zstandard/backend_cffi.py:1531, 1575, 1682.

Fix: raise ZstdError("... %s" % error) or an f-string.
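The behaviour can be seen without zstandard installed. The following is a minimal sketch: the ZstdError class here is a plain Exception subclass standing in for the real one, and the message text and detail string are hypothetical, not copied from backend_cffi.py.

```python
class ZstdError(Exception):
    """Stand-in for zstandard.ZstdError, used only for this illustration."""


# Hypothetical error detail; the real code obtains this from libzstd.
detail = "Src size is incorrect"

# Buggy pattern: two positional args, so no %-formatting takes place.
buggy = ZstdError("cannot compress: %s", detail)

# Fixed pattern: interpolate before constructing the exception.
fixed = ZstdError("cannot compress: %s" % detail)

print(buggy.args)   # a 2-tuple: the format string and the detail
print(str(buggy))   # the repr of that tuple, which is what users see
print(str(fixed))   # the intended single-string message
```

With more than one positional argument, Exception.__str__ returns the string form of the args tuple, which is why the user sees parentheses, quotes, and a literal %s.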

Gap 2: __exit__ ordering mismatch with the C backend

C backend's __exit__ calls close() first, then clears the compressor/decompressor field. CFFI backend sets _compressor = None first, then calls close() — which runs with the invariants already broken. Any reference to self._compressor inside close() observes None.

Sites: zstandard/backend_cffi.py:1413, 3161.

Fix: Reorder to match the C backend:

def __exit__(self, exc_type, exc_value, tb):
    self.close()                    # first
    self._compressor = None         # then clear
    return False
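To illustrate why the order matters, here is a toy sketch. Writer and FakeCompressor are hypothetical classes, not the real zstandard types, and the assumption that close() touches self._compressor is made purely for the demonstration; whether the current CFFI close() does so has not been verified here.

```python
class FakeCompressor:
    """Hypothetical stand-in for the object held in _compressor."""

    def flush(self):
        return b"tail"


class Writer:
    """Toy context manager whose close() depends on _compressor being set."""

    def __init__(self, clear_first):
        self._compressor = FakeCompressor()
        self._clear_first = clear_first
        self.closed = False
        self.flushed = b""

    def close(self):
        if self.closed:
            return
        # Relies on the invariant that _compressor has not been cleared yet.
        self.flushed = self._compressor.flush()
        self.closed = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, tb):
        if self._clear_first:
            # Ordering described for the CFFI backend: clear, then close.
            self._compressor = None
            self.close()
        else:
            # Ordering described for the C backend: close, then clear.
            self.close()
            self._compressor = None
        return False


def run(clear_first):
    """Return the flushed bytes, or the name of the exception raised on exit."""
    try:
        with Writer(clear_first) as w:
            pass
    except AttributeError:
        return "AttributeError"
    return w.flushed


print(run(clear_first=True))
print(run(clear_first=False))
```

In this sketch the clear-first ordering fails inside close() because it dereferences None, while the close-first ordering flushes normally. Even if the real close() never reads the field today, the close-first order keeps that safe for future changes.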

Gap 3: Minor parameter-validation drift

  • Off-by-one in the level-validation error message. CFFI says "less than 22"; C says "less than 23". One of them has the boundary wrong. ZSTD_maxCLevel() returns 22, so if the C backend's boundary is correct, both messages should say "less than 23" in the strictly-less-than formulation, or "more than 22" in the symmetric one.
  • compressobj(size=-1) is accepted by CFFI but raises OverflowError in C, which points to a signed/unsigned mismatch somewhere in the CFFI argument handling.
  • The CFFI backend does not declare Py_MOD_GIL_NOT_USED (not applicable, since it is pure Python), but it has no free-threading (FT) story either; worth either documenting this or gating free-threaded wheels to the C backend only.
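Validation drift like the first two bullets could be caught by a small parity check that runs the same call against both backends and compares the outcome. The sketch below assumes nothing about zstandard's API: outcome and parity_diff are hypothetical helpers, and the two compressobj functions are stand-ins that only mimic the behaviour described above, with an invented error message.

```python
def outcome(func, *args, **kwargs):
    """Return ("ok", None), or (exception type name, message) if the call raises."""
    try:
        func(*args, **kwargs)
    except Exception as exc:
        return (type(exc).__name__, str(exc))
    return ("ok", None)


def parity_diff(c_func, cffi_func, *args, **kwargs):
    """Return None if both callables behave alike, else the two outcomes."""
    c_result = outcome(c_func, *args, **kwargs)
    cffi_result = outcome(cffi_func, *args, **kwargs)
    return None if c_result == cffi_result else (c_result, cffi_result)


# Stand-in for the C backend: rejects negative sizes (message is invented).
def c_compressobj(size):
    if size < 0:
        raise OverflowError("negative size")
    return object()


# Stand-in for the CFFI backend: accepts any size, as reported in this issue.
def cffi_compressobj(size):
    return object()


print(parity_diff(c_compressobj, cffi_compressobj, size=-1))
print(parity_diff(c_compressobj, cffi_compressobj, size=10))
```

Because the comparison includes the message text, the same helper would also flag the "less than 22" versus "less than 23" wording difference once pointed at the real backends.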

Suggested PR shape

All three gaps fit in one small PR (pure-Python fixes, likely ~20 lines of diff). If you'd prefer to keep C-side and CFFI-side PRs separate I can do that; otherwise one combined parity PR seems cleanest.

Methodology

Found via cext-review-toolkit — the parity-checker agent identifies places where two implementations of the same interface diverge. Gaps 1 and 2 were flagged via structural pattern matching against the C backend; Gap 3 via argument-validation diffing. The CFFI environment wasn't available during the analysis, so none of the above was live-reproduced; verification is recommended but the diffs are small and the patterns are unambiguous on inspection.

Discovery, root-cause analysis, and issue drafting were performed by Claude Code and reviewed by a human before filing.

Full report

Complete multi-agent analysis (48 FIX findings across 13 categories, plus a reproducer appendix): https://gist.github.com/devdanzin/b86039ac097141579590c1a0f3a43605
