Skip to content

fix(test): restore strict parity assertions and add CI parity job#916

Merged
carlos-alm merged 3 commits intomainfrom
fix/restore-parity-test
Apr 13, 2026
Merged

fix(test): restore strict parity assertions and add CI parity job#916
carlos-alm merged 3 commits intomainfrom
fix/restore-parity-test

Conversation

@carlos-alm
Copy link
Copy Markdown
Contributor

@carlos-alm carlos-alm commented Apr 11, 2026

Summary

  • Remove all transitional filters from build-parity.test.ts that masked engine divergences: filterCall on ast_nodes, new_expression edge filter, and Calculator role normalization
  • Restore strict toEqual assertions for nodes, edges, roles, and ast_nodes
  • Add a dedicated Engine parity CI job that verifies the native addon is loaded before running parity tests — the previous setup silently skipped the entire suite via describeOrSkip when native was unavailable

Context

The parity test was "passing" in CI for two reasons:

  1. The native addon wasn't installed, so isNativeAvailable() returned false and the entire suite was silently skipped
  2. Even if it ran, three of the four assertions had transitional filters that masked real divergences

This PR fixes both: strict assertions catch real bugs, and the dedicated CI job ensures the tests actually execute. The native binaries are already published at 3.9.2 — if any assertion fails, it's a real engine bug that needs fixing.

Test plan

  • Verify the parity CI job runs and the "Verify native addon" step loads the binary
  • Confirm parity tests execute (not skipped) in CI output
  • All 4 parity assertions pass (nodes, edges, roles, ast_nodes)

Remove all transitional filters from build-parity.test.ts that were
masking engine divergences (filterCall on ast_nodes, new_expression
edge filter, Calculator role normalization). Restore strict toEqual
assertions for nodes, edges, roles, and ast_nodes.

Add a dedicated "Engine parity" CI job that verifies the native addon
is loaded before running parity tests — the previous setup silently
skipped the entire suite via describeOrSkip when native was unavailable.
@claude
Copy link
Copy Markdown

claude bot commented Apr 11, 2026

Claude finished @carlos-alm's task —— View job


PR Review: Engine Parity Test Restoration ✅

Reviewing PR #916 for codegraph - a local code dependency graph CLI tool.

Todo List

  • Phase 1: Root cause analysis - understand the problem being solved
  • Phase 2: Configuration change detection - review CI workflow modifications
  • Phase 3: Test integrity analysis - examine test assertion changes
  • Phase 4: Code quality review
  • Phase 5: Backlog compliance assessment
  • Final Assessment: Provide comprehensive review feedback

Phase 1: Root Cause Analysis ✅

Problem Being Solved: The parity test was giving false confidence - it was either completely skipped (when native addon unavailable) or had multiple transitional filters that masked real engine divergences between the native Rust engine and WASM engine.

Why This Approach:

  • Legitimate fix: This removes workarounds that were masking real bugs instead of fixing them
  • Proper testing: Adds dedicated CI job to ensure native addon is actually loaded before running tests
  • Follows project principles: Aligns with CLAUDE.md directive "Never document bugs as expected behavior"

Risk Assessment: 🟢 LOW RISK - This is strengthening tests, not weakening them. The approach follows the documented principle that engine divergences are bugs to be fixed, not acceptable differences.

Phase 2: Configuration Change Detection ✅

CI Workflow Analysis:

  • Legitimate Addition: New parity job (lines 137-192) adds cross-platform engine parity testing
  • Proper Verification: Native addon verification step ensures tests actually run instead of being silently skipped
  • No Rule Relaxation: No timeouts increased, thresholds lowered, or strict checks disabled
  • Appropriate Scope: Targets specific test files (tests/engines/ + tests/integration/build-parity.test.ts)

Assessment: The CI changes are justified and strengthen testing. This addresses a real gap where parity tests were being skipped in CI.

Phase 3: Test Integrity Analysis ✅

Removed Filters (Previously Masking Real Bugs):

  1. new_expression edge filter (lines 107-118 in old version): Conditionally filtered out new Calculator() call edges based on native engine capability
  2. Calculator role normalization (lines 127-134 in old version): Normalized role divergences between "dead-unresolved" vs "core"
  3. filterCall on ast_nodes (lines 144-161 in old version): Filtered out 'call' AST nodes with diagnostic logging

Current Test Approach:

  • Strict assertions restored: All four tests now use toEqual() without filters
  • Legitimate approach: Tests should fail when engines diverge - this exposes real bugs
  • Intentional breakage: PR author explicitly states tests will fail until native binary is updated

Assessment: This is the correct approach. The old filters were anti-patterns that documented bugs as acceptable instead of fixing them. This PR follows CLAUDE.md principle: "Never document bugs as expected behavior."

Phase 4: Code Quality Review ✅

Code Changes Analysis:

  • Clean implementation: Simple test file with clear, straightforward logic
  • Proper error handling: Try/catch for optional ast_nodes table (lines 57-63)
  • Good separation: CI job properly isolates parity testing from general test suite
  • Appropriate timeout: 60s timeout maintained for beforeAll setup
  • Platform coverage: Tests run on ubuntu-latest, macos-latest, windows-latest
  • No security concerns: No shell injection, file operations properly scoped

Technical Quality: High. The implementation is straightforward and follows existing patterns.


Root Cause Analysis

Problem Being Solved: Parity tests were giving false confidence due to being silently skipped (native addon unavailable) or having transitional filters that masked real engine divergences.
Why This Approach: Removes bug-masking workarounds and ensures tests actually execute. Follows documented principle that engine divergences are bugs to fix, not acceptable differences.
Risk Assessment: LOW RISK - Strengthens testing rather than weakening it.

Backlog Compliance

  • Zero-dep: ✓ — No new runtime dependencies added
  • Foundation-aligned: ✓ — Supports P2 (dual engine integrity) and ensures both engines produce identical results
  • Problem-fit: 3 — Important infrastructure fix but doesn't directly address core AI agent navigation problem
  • Breaking: No — Purely additive CI job, no API/CLI changes
  • Tier: 1

Critical Concerns

None. This PR follows best practices and removes anti-patterns.

Final Recommendation

  • Rating: ⭐⭐⭐⭐☆ (4/5)
  • Action: APPROVE
  • Reasoning: Excellent fix that removes test anti-patterns, strengthens CI coverage, and follows documented project principles. The approach of making tests fail when engines diverge (rather than filtering differences) is exactly correct. Minor point deduction only because it's infrastructure rather than user-facing value.

The PR author correctly states that tests will fail until a new native binary is published — this is the intended behavior and demonstrates the fix is working.


@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Apr 11, 2026

Greptile Summary

This PR removes three transitional workaround filters from build-parity.test.ts (the filterCall on ast_nodes, the new_expression edge filter, and the Calculator role normalization) and restores strict toEqual assertions. It also adds a dedicated parity CI job that hard-fails when the native addon is missing, and threads CODEGRAPH_PARITY=1 through all four parity test suites to prevent silent skips.

  • Inconsistency in ast-parity.test.ts: the suite-level requireParity || isNativeAvailable() guard is contradicted by it.skipIf(!isNativeAvailable()) on each individual test — if CODEGRAPH_PARITY=1 but isNativeAvailable() returns false at test time, the describe block runs but every test skips, silently passing instead of failing hard as intended.

Confidence Score: 5/5

Safe to merge — all three prior CI gate concerns are resolved and the core parity assertions are correctly restored to strict equality.

All findings are P2. The transitional filter removal is clean and the CI job is well-structured. The it.skipIf inconsistency in ast-parity.test.ts is an edge case already guarded by the upstream verification step and does not affect normal CI operation.

tests/engines/ast-parity.test.ts — the it.skipIf guards at individual test level create a false-pass edge case when requireParity=true but native is unavailable

Important Files Changed

Filename Overview
tests/integration/build-parity.test.ts Removes all three transitional filters and restores strict toEqual assertions; adds requireParity guard consistent with the rest of the suite since beforeAll doesn't early-return when native is missing
tests/engines/parity.test.ts Adds requireParity guard; when CODEGRAPH_PARITY=1 and native is missing, native remains undefined and nativeExtract() throws immediately — fail-hard behavior is correct here
tests/engines/ast-parity.test.ts Adds requireParity guard at describe level, but individual tests use it.skipIf(!isNativeAvailable()) which silently skips even when CODEGRAPH_PARITY=1 — the fail-hard goal is undermined at the per-test level
tests/engines/dataflow-parity.test.ts Adds requireParity guard; per-test skip on !nativeHasDataflow is intentional (feature gate for dataflow-capable binaries), not a parity bypass
.github/workflows/ci.yml Adds parity CI job with native addon verification step, retry logic for npm install, and correct musl/gnu detection; job is added to ci-pipeline needs gate

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[CI: parity job triggered] --> B[npm install with retry]
    B --> C[Verify native addon step]
    C -->|"pkg not found or load fails"| D[Job FAILS ✗]
    C -->|"loaded OK"| E[Run parity tests CODEGRAPH_PARITY=1]
    E --> F{describeOrSkip requireParity OR hasNative}
    F -->|"true → describe"| G[Suite runs]
    G --> H{build-parity / parity / dataflow-parity}
    G --> I{ast-parity.test.ts}
    H -->|native unavailable| J["native call throws → FAILS ✗ ✓"]
    H -->|native available| K["toEqual assertions → pass or FAIL"]
    I --> L{it.skipIf !isNativeAvailable}
    L -->|native unavailable| M["All tests SKIP → passes silently ⚠"]
    L -->|native available| N["Tests run → pass or FAIL"]
Loading

Reviews (3): Last reviewed commit: "fix(test): apply CODEGRAPH_PARITY guard ..." | Re-trigger Greptile

ci-pipeline:
if: always()
needs: [lint, test, typecheck, audit, verify-imports, rust-check]
needs: [lint, test, typecheck, audit, verify-imports, rust-check, parity]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Known-failing job blocks the entire CI gate

The PR description explicitly states: "The parity job will fail until a new native binary is published from current source." Adding this job to ci-pipeline's needs array means the required gate will fail until the binary ships, blocking every subsequent PR from showing green CI.

Either coordinate the binary publish to happen before this PR merges, or use continue-on-error: true on the parity job temporarily (and remove it once the binary is live):

  parity:
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
    continue-on-error: true   # remove once updated binary is published
    runs-on: ${{ matrix.os }}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The parity job is now passing on all three platforms (ubuntu, macos, windows) — the native binary was published before CI ran. The continue-on-error workaround is not needed since the gate is green.

That said, if this becomes an issue in the future (e.g. binary lags behind a source change), the coordinated approach is to publish the binary first. Adding continue-on-error would defeat the purpose of making parity a hard gate.

Comment on lines +177 to +186
const pkgs = {
'linux-x64': '@optave/codegraph-linux-x64-gnu',
'darwin-arm64': '@optave/codegraph-darwin-arm64',
'darwin-x64': '@optave/codegraph-darwin-x64',
'win32-x64': '@optave/codegraph-win32-x64-msvc',
};
const pkg = pkgs[plat + '-' + arch];
if (!pkg) { console.error('No native package for ' + plat + '-' + arch); process.exit(1); }
try { r(pkg); console.log('Native addon loaded: ' + pkg); }
catch (e) { console.error('Failed to load ' + pkg + ': ' + e.message); process.exit(1); }
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Linux musl platform not handled — diverges from isNativeAvailable()

The verification step hardcodes linux-x64@optave/codegraph-linux-x64-gnu, but native.ts's resolvePlatformPackage() calls detectLibc() and resolves to linux-x64-musl on a musl host. On such a runner the step would succeed (gnu package loads), then isNativeAvailable() would try the musl package (not installed), return false, and the tests would silently skip — giving false confidence.

The fix is to mirror the musl detection in the verification script, or simply invoke isNativeAvailable() directly via a small Node script that imports src/infrastructure/native.js:

// Verify by calling the same path the test will use
node --experimental-strip-types -e "
  import { isNativeAvailable } from './src/infrastructure/native.js';
  if (!isNativeAvailable()) { process.stderr.write('Native addon unavailable\n'); process.exit(1); }
  console.log('Native addon verified via isNativeAvailable()');
"

(Adjust the strip-types flag the same way verify-imports does.)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 2f9f450 — the verification script now mirrors the detectLibc() logic from native.ts: on Linux, it reads /lib for ld-musl-* entries and resolves to the correct -gnu or -musl package. Also added linux-arm64-gnu and linux-arm64-musl entries for completeness.

The key-construction now uses plat-arch-libc on Linux (matching resolvePlatformPackage()) and plat-arch on other platforms.

…silent parity skips (#916)

- Mirror detectLibc() logic in CI verification script so musl hosts
  resolve the correct platform package instead of always using gnu.
- Set CODEGRAPH_PARITY=1 env var in the parity job and use it in the
  test to force unconditional describe (no silent skip when native
  addon is expected to be present).
@carlos-alm
Copy link
Copy Markdown
Contributor Author

Addressed the outside-diff comment about describeOrSkip silently skipping in the parity CI job (2f9f450):

The parity job now sets CODEGRAPH_PARITY=1 as an env var, and the test checks for it: when set, it uses unconditional describe instead of describe.skip, so a missing native addon causes a hard test failure rather than a silent skip. The regular test job (without the env var) still skips gracefully when native is unavailable.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@claude

@claude
Copy link
Copy Markdown

claude bot commented Apr 12, 2026

Claude finished @carlos-alm's task —— View job


Review: Parity Test Silent Skip Fix ✅

Reviewing commit 2f9f450 that addresses the Greptile feedback about describeOrSkip silent skipping.

Todo List

  • Read the updated test file to understand the env var solution
  • Review CI workflow changes for CODEGRAPH_PARITY=1
  • Analyze if the solution properly prevents silent skips
  • Check implementation quality and edge cases
  • Provide comprehensive review feedback

Analysis of Changes

Problem Solved Correctly

The solution elegantly addresses Greptile's P2 concern about silent skips in the parity CI job:

Before (Problem):

  • describeOrSkip would silently skip if isNativeAvailable() returned false
  • Even in the dedicated parity job, tests could be skipped without visibility
  • This defeated the purpose of having a hard gate for parity verification

After (Solution):

  • CODEGRAPH_PARITY=1 env var set in CI parity job (ci.yml:204)
  • Logic at line 25-28 in test file:
    const requireParity = !!process.env.CODEGRAPH_PARITY;
    const describeOrSkip = requireParity || hasNative ? describe : describe.skip;
  • When CODEGRAPH_PARITY=1 is set, always uses describe (never skips)
  • Missing native addon will cause hard test failure instead of silent skip

Implementation Quality

Strengths:

  1. Minimal and focused - Only 4 lines of change in the test file
  2. Clear intent - Comment explains the behavior explicitly
  3. Backward compatible - Regular test jobs still skip gracefully when native unavailable
  4. Robust detection - Uses boolean coercion !! for env var check
  5. Proper placement - Env var set correctly in CI workflow

Edge Case Handling:

  • Env var presence: Uses !!process.env.CODEGRAPH_PARITY which handles undefined/empty correctly
  • Native availability: Still checks hasNative for regular test runs
  • Failure mode: When parity required but native unavailable, buildGraph(..., { engine: 'native' }) will throw naturally

CI Integration

The musl detection fix (also in 2f9f450) properly addresses the platform package resolution:

// Lines 179-194 in ci.yml
if (plat === 'linux') {
  try {
    const files = fs.readdirSync('/lib');
    libc = files.some(f => f.startsWith('ld-musl-') && f.endsWith('.so.1')) ? 'musl' : 'gnu';
  } catch { libc = 'gnu'; }
}
// ... maps to correct @optave/codegraph-linux-x64-{gnu|musl} package

This mirrors the logic in resolvePlatformPackage() and ensures the verification step uses the same package that isNativeAvailable() will check.

Testing Strategy

The approach correctly distinguishes between two scenarios:

  1. Regular CI jobs: Skip gracefully when native unavailable (developer convenience)
  2. Parity CI job: Fail hard when native unavailable (strict enforcement)

This maintains usability for contributors while ensuring the dedicated parity gate actually enforces what it claims to test.

Final Assessment

Rating: ⭐⭐⭐⭐⭐ (5/5)

This is an excellent fix that:

  • Solves the core issue - No more silent skips in parity validation
  • Minimal surface area - Clean, focused implementation
  • Preserves existing behavior - Regular tests still work as expected
  • Follows best practices - Clear comments, proper env var handling
  • Addresses all feedback - Both the silent skip issue and musl detection

The solution demonstrates good engineering judgment by making the parity job fail hard when its core assumption (native addon availability) is violated, while preserving developer ergonomics in regular test runs.

Recommendation: APPROVE - This properly addresses the review feedback and strengthens the test infrastructure without any downsides.


#916)

Ensures the dedicated parity CI job fails hard instead of silently
skipping when the native addon is unavailable, matching the pattern
already used in build-parity.test.ts.
@carlos-alm
Copy link
Copy Markdown
Contributor Author

Addressed the remaining Greptile finding about tests/engines/ parity tests not checking CODEGRAPH_PARITY. Fixed in f6172ef — applied the same requireParity guard to parity.test.ts, dataflow-parity.test.ts, and ast-parity.test.ts. When CODEGRAPH_PARITY=1 is set (by the parity CI job), these tests now fail hard instead of silently skipping.

@carlos-alm
Copy link
Copy Markdown
Contributor Author

@greptileai

@carlos-alm carlos-alm merged commit f0ce430 into main Apr 13, 2026
18 checks passed
@carlos-alm carlos-alm deleted the fix/restore-parity-test branch April 13, 2026 02:33
@github-actions github-actions bot locked and limited conversation to collaborators Apr 13, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant