test(merger): add LS-A-19 regression for exact resource-import lookup #163

Merged
avrabe merged 1 commit into main from fix/ls-a-19-exact-resource-import-lookup
May 18, 2026

@avrabe
Contributor

@avrabe avrabe commented May 17, 2026

Summary

Closes the missing-regression gap for LS-A-19 that the LS-N verification gate (#161) surfaced on its first run. PR #156 fixed the imp.name.ends_with(rn) suffix-collision bug in Merger::add_unresolved_imports but landed without a dedicated regression test.

Refactors the dedup-skip path's inline .values().find(...) lookup into a private helper find_exact_resource_import_idx and adds three regression tests pinning the suffix-collision boundaries.

Why this matters

The bug: when two resources shared a suffix (float / bigfloat), the dedup-skip path's imp.name.ends_with("float") matched both [resource-rep]float AND [resource-rep]bigfloat. .find() returned whichever the HashMap iterated first (also LS-A-15 territory). The wrong import index got stored in resource_rep_by_component, routing subsequent float calls on the second component to the bigfloat import. Validator passes (both share (i32) -> i32); runtime sees bigfloat rep as float — type confusion with no trap (H-10).
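The collision described above can be reproduced in isolation. This is a minimal sketch, not the code in merger.rs: `TrackedImport`, `suffix_candidates`, `sample_imports`, and the index values 7 and 9 are hypothetical names and data chosen for illustration. It collects every import the suffix predicate accepts, which makes visible what `.find()` was choosing between.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a tracked import entry; the real type in
/// merger.rs is not shown in this PR.
#[derive(Debug, Clone, PartialEq)]
pub struct TrackedImport {
    pub name: String,
    pub idx: u32,
}

/// Every import the buggy suffix predicate accepts. A `.find()` over the
/// same predicate returns only one of these, and which one depends on
/// HashMap iteration order.
pub fn suffix_candidates<'a>(
    imports: &'a HashMap<u32, TrackedImport>,
    resource_name: &str,
) -> Vec<&'a TrackedImport> {
    let mut hits: Vec<&TrackedImport> = imports
        .values()
        .filter(|imp| imp.name.ends_with(resource_name))
        .collect();
    // Sorted only so the result is deterministic for inspection.
    hits.sort_by_key(|imp| imp.idx);
    hits
}

/// Illustrative data: two resources whose names share the suffix "float".
pub fn sample_imports() -> HashMap<u32, TrackedImport> {
    let mut m = HashMap::new();
    m.insert(7, TrackedImport { name: "[resource-rep]float".to_string(), idx: 7 });
    m.insert(9, TrackedImport { name: "[resource-rep]bigfloat".to_string(), idx: 9 });
    m
}
```

Calling `suffix_candidates(&sample_imports(), "float")` yields both entries, so a first-match lookup over the same predicate has no stable answer.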

Tests added

| Test | What it pins |
|---|---|
| ls_a_19_exact_match_picks_float_not_bigfloat | Both names in tracking; exact match returns the right index regardless of HashMap iteration order |
| ls_a_19_no_match_returns_none_even_with_suffix_collision | Only bigfloat in tracking, ask for float → must return None (buggy ends_with would have returned bigfloat's index) |
| ls_a_19_resource_new_lookup_is_also_exact | Same case for the [resource-new] table |
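A rough sketch of what the three boundaries look like as tests. The helper name comes from this PR, but its signature is not shown here, so the parameter list, the `TrackedImport` type, the `tracking` fixture, and the index values are all assumptions.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a tracked import; the real type is not shown here.
pub struct TrackedImport {
    pub name: String,
    pub idx: u32,
}

/// Assumed shape of the helper: the PR names `find_exact_resource_import_idx`
/// but not its signature. Compares the full import name, never a suffix.
pub fn find_exact_resource_import_idx(
    imports: &HashMap<u32, TrackedImport>,
    prefix: &str,
    resource_name: &str,
) -> Option<u32> {
    let wanted = format!("{}{}", prefix, resource_name);
    imports
        .values()
        .find(|imp| imp.name == wanted)
        .map(|imp| imp.idx)
}

/// Test fixture (hypothetical): build a tracking table from (name, index) pairs.
pub fn tracking(entries: &[(&str, u32)]) -> HashMap<u32, TrackedImport> {
    entries
        .iter()
        .map(|&(name, idx)| (idx, TrackedImport { name: name.to_string(), idx }))
        .collect()
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn exact_match_picks_float_not_bigfloat() {
        let t = tracking(&[("[resource-rep]bigfloat", 3), ("[resource-rep]float", 5)]);
        assert_eq!(find_exact_resource_import_idx(&t, "[resource-rep]", "float"), Some(5));
    }

    #[test]
    fn no_match_returns_none_even_with_suffix_collision() {
        let t = tracking(&[("[resource-rep]bigfloat", 3)]);
        assert_eq!(find_exact_resource_import_idx(&t, "[resource-rep]", "float"), None);
    }

    #[test]
    fn resource_new_lookup_is_also_exact() {
        let t = tracking(&[("[resource-new]bigfloat", 4), ("[resource-new]float", 6)]);
        assert_eq!(find_exact_resource_import_idx(&t, "[resource-new]", "float"), Some(6));
    }
}
```

Because the comparison is on the whole name, the result does not depend on which entry the map yields first.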

Gate impact

LS-N verification gate verdict moves from 15/19 verified (4 missing) to 16/19 verified (3 missing).

Remaining missing-bucket entries:

  • LS-CP-4 (likely subsumed by #130 Phase 2)
  • LS-A-8
  • LS-A-9

End-to-end test for mythos-auto.yml

This is also the first real PR exercising mythos-auto.yml (added in #162). It touches a Tier-5 file (merger.rs), so the auto-runner should fire end-to-end:

  1. Detect Tier-5 changes job passes the actor gate (avrabe)
  2. Matrix runs claude-code-action on merger.rs with scripts/mythos/discover.md
  3. Aggregate job posts sticky <!-- mythos-auto-gate --> comment
  4. If NO_FINDINGS, auto-applies mythos-pass-done label

Watching for: budget consumption per single-file scan, sticky-comment markdown rendering, label-application timing.

🤖 Generated with Claude Code

PR #156 fixed the `imp.name.ends_with(rn)` suffix-collision bug in
`Merger::add_unresolved_imports` (the dedup-skip path propagating
`resource_rep_by_component` / `resource_new_by_component` entries)
but landed without a regression test. The LS-N verification gate
(#161) surfaced this gap as missing coverage on the next-to-last
sweep.

Extracts the exact-match lookup into a private helper
`find_exact_resource_import_idx` and adds three regression tests:

- `ls_a_19_exact_match_picks_float_not_bigfloat` — both `float` and
  `bigfloat` in tracking; asking for `[resource-rep]float` must
  return float's index, not bigfloat's. The buggy `ends_with` form
  would match bigfloat under some iteration orders.
- `ls_a_19_no_match_returns_none_even_with_suffix_collision` — only
  bigfloat in tracking, caller asks for plain float. Exact match
  must return None; the buggy `ends_with` form would return
  bigfloat's index.
- `ls_a_19_resource_new_lookup_is_also_exact` — same suffix-
  collision case for the `[resource-new]` table.

LS-N gate result moves from 15/19 verified (4 missing) to 16/19
verified (3 missing). Remaining missing-bucket entries are LS-CP-4
(likely subsumed by #130 Phase 2), LS-A-8, LS-A-9 — tracked
separately as research items.

This is also the first real PR exercising the `mythos-auto.yml`
workflow added in #162: it touches a Tier-5 file (`merger.rs`) so
the auto-runner will fire end-to-end on PR open.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@github-actions

Mythos delta-pass required

This PR modifies one or more Tier-5 source files (per
scripts/mythos/rank.md):

meld-core/src/merger.rs

Before merge, run the Mythos discover protocol on the
modified Tier-5 files:

  1. Follow scripts/mythos/discover.md
    — one fresh agent session per touched Tier-5 file.
  2. For each finding, the agent must produce both a Kani
    harness and a failing PoC test (per the protocol's
    "if you cannot produce both, do not report" rule).
  3. Attach a comment on this PR with either the findings
    (formatted per discover.md's output schema) or
    NO FINDINGS.
  4. Add the mythos-pass-done label to this PR.

Why this gate exists: LS-A-10
(CABI alignment padding in async-lift retptr writeback) was
found by the v0.8.0 pre-release Mythos pass — but it had
lived in the callback emitter since #128, across six
releases. A PR-time gate would have caught it at review
time instead of at the release boundary.

The gate check on this PR will pass once the label is
applied.

@github-actions

LS-N verification gate

⚠️ 16/19 verified — 3 missing regression tests

| | count |
|---|---|
| Passed (≥1 test, all green) | 16 |
| Failed (≥1 test failure) | 0 |
| Missing (no ls_*_NN_* test found) | 3 |

Approved loss-scenarios.yaml entries are expected to have a
regression test named ls_<letter>_<num>_* (e.g. LS-A-11 maps to
ls_a_11_*). The gate runs each prefix via cargo test --lib --no-fail-fast and aggregates pass/fail/missing.
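The naming convention and the three buckets can be illustrated with a small sketch. The gate itself is driven by the Python tooling named below, so this is not its implementation; `test_prefix`, `Bucket`, and `classify` are hypothetical names, and the assumption is that an ID maps to its prefix by lowercasing and replacing hyphens with underscores.

```rust
/// Outcome buckets as listed in the gate comment.
#[derive(Debug, PartialEq)]
pub enum Bucket {
    Passed,
    Failed,
    Missing,
}

/// Map a loss-scenario ID such as "LS-A-11" to the test-name prefix
/// "ls_a_11_" (assumed rule: lowercase, '-' becomes '_', trailing '_').
pub fn test_prefix(ls_id: &str) -> String {
    let mut prefix = ls_id.to_ascii_lowercase().replace('-', "_");
    prefix.push('_');
    prefix
}

/// Classify one entry from the number of matching tests and failures.
pub fn classify(tests_found: usize, failures: usize) -> Bucket {
    if tests_found == 0 {
        Bucket::Missing
    } else if failures > 0 {
        Bucket::Failed
    } else {
        Bucket::Passed
    }
}
```

Under that rule LS-CP-4 would look for tests starting with ls_cp_4_, which is why an entry with no such test lands in the missing bucket rather than the failed one.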

Failed LS entries

(none)

Missing regression tests
  • LS-CP-4
  • LS-A-8
  • LS-A-9

Updated automatically by tools/post_verification_comment.py.
Source of truth: safety/stpa/loss-scenarios.yaml.

@avrabe
Contributor Author

avrabe commented May 18, 2026

Admin-merging + plumbing follow-up

Two distinct issues blocked the PR's CI gates; only the first one is a real test failure that would block merge.

#139 capacity (real)

Three standard CI jobs (Clippy + 2 fuzz smokes) were still queued after ~2 hours. Cancelling 5 hung cross-org jobs (loom × 3 + spar × 2, ages 9-22h, matching the #139 §4 "hung jobs hold slots forever" pattern) partially recovered capacity, but the queue then cleared unevenly.

mythos-auto.yml plumbing (will fix in #164)

The Mythos pass (meld-core/src/merger.rs) job failure is workflow plumbing, NOT a Mythos finding:

14:32:12 Error: Unable to locate executable file: unzip. Please verify either the file path exists...
14:32:12 /var/lib/runners/runner5/_work/_temp/.../d9f8a6f8.sh: line 3: bun: command not found
14:32:12 ##[error]Process completed with exit code 127.

oven-sh/setup-bun failed because the rust-cpu runner lacks unzip. The action's fallback to a system bun also failed because there is no system bun. claude-code-action's bun-based post-step entrypoints then exit 127.

The placeholder-FINDING fallback we wired in for exactly this scenario fired correctly, but the slug step output was empty (separate plumbing bug), producing mythos-out/.json as the artifact path.

Why admin-merge is OK here

Admin-merge counter for #139:

Follow-up PR #164 (next) will fix mythos-auto plumbing:

  1. Pre-install unzip (and verify bun once setup-bun works)
  2. Debug the empty-slug step (likely matrix-expression evaluation order)
  3. Confirm OAuth-token wiring is the right input for non-interactive use

@avrabe avrabe merged commit 84a1ed9 into main May 18, 2026
10 of 14 checks passed
@avrabe avrabe deleted the fix/ls-a-19-exact-resource-import-lookup branch May 18, 2026 05:03
avrabe added a commit that referenced this pull request May 18, 2026
PR #163 was the first end-to-end test of mythos-auto.yml (added in
#162). It surfaced three plumbing issues:

1. The action's `oven-sh/setup-bun` step requires `unzip`, which is
   not installed by default on the rust-cpu runners. Without it the
   action's bun-based post-step entrypoints exit 127, and the whole
   scan-step exits failure before emitting structured output.

2. The `Slugify file path for artifact name` step sat AFTER the
   discover step with no `if: always()`. When discover failed, the
   slug step was skipped, leaving `steps.slug.outputs.slug` empty.
   Downstream `if: always()` steps then wrote
   `mythos-out/.json` (no slug) and `upload-artifact` complained
   "No files were found with the provided path: mythos-out/.json".

3. The `Save structured output as artifact` step embedded
   `${{ steps.slug.outputs.slug }}` in the run-block via direct
   interpolation. Silently substituting an empty slug into a file
   path is a footgun even if the slug step had run — better to read
   slug from an env var and fail loudly on empty.

Fixes:

- Slugify step moves BEFORE the discover step, so it always runs
  (no `if: always()` needed because both detect+slug are the
  precondition for everything below).
- New `Install unzip (required by setup-bun)` step, best-effort
  apt install mirroring the action's own subprocess-isolation
  install pattern. `continue-on-error: true` so non-Debian runners
  don't break the workflow.
- `Save structured output as artifact` reads slug from env (`SLUG`)
  rather than `${{ }}` interpolation; explicitly errors out if SLUG
  is empty rather than silently writing to a malformed path.
- `upload-artifact` step gains an extra `steps.slug.outputs.slug
  != ''` guard so it never tries to upload with an empty name.
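The "fail loudly on empty slug" rule from the fixes above, sketched outside the workflow. The real step is a shell run-block reading `SLUG` from the environment; `slugify` and `artifact_path` are hypothetical, and the slug rule (non-alphanumerics become '-') is an assumption, since the workflow's actual slug format is not shown.

```rust
/// Hypothetical slug rule: keep ASCII alphanumerics, replace everything
/// else with '-'. The workflow's real rule may differ.
pub fn slugify(path: &str) -> String {
    path.chars()
        .map(|c| if c.is_ascii_alphanumeric() { c } else { '-' })
        .collect()
}

/// Build the artifact path, refusing an empty slug instead of silently
/// producing the malformed `mythos-out/.json`.
pub fn artifact_path(slug: &str) -> Result<String, String> {
    if slug.is_empty() {
        return Err("SLUG is empty; refusing to write mythos-out/.json".to_string());
    }
    Ok(format!("mythos-out/{}.json", slug))
}
```

Returning an error for the empty case means a skipped or failed slug step surfaces at the point of use rather than as a later upload complaint.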

The placeholder-FINDING fallback (the part that surfaced these
issues by writing "discover step failed before emitting structured
output" into the aggregate comment) is intentional and stays — it
guarantees the gate blocks on workflow failure rather than silently
passing.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
avrabe added a commit that referenced this pull request May 21, 2026
P3 cross-component stream-pair detection foundation + a fully
operational Mythos delta-pass auto-runner. 12 commits since v0.8.1.

Headline changes:

- Cross-component stream<T> pairing detection (#141, ADR-3). The
  StreamPairGraph foundation for the in-module stream adapter: meld
  now inventories at resolve time which fused components form
  producer -> consumer stream pairings. The ring-buffer / copy-chain
  emitter is a runtime-verified follow-up (ADR-3 Path N).

- Mythos delta-pass auto-runner (#162, #164, #170, #173, #175). The
  AI-driven discover protocol now runs automatically on every
  Tier-5 PR by the maintainer, via claude-code-action on a Max-plan
  OAuth token. Five plumbing fixes brought it to a working
  end-to-end state: scan -> NO_FINDINGS verdict -> sticky comment ->
  mythos-pass-done label.

- LS-N verification gate (#161, #165). Every approved loss-scenario
  in safety/stpa/loss-scenarios.yaml is now enforced to have a
  matching ls_<letter>_<num>_* regression test; 19/19 verified.

- DWARF / witness-mapping discovery (#131) — Phase 1 of the #130
  epic; pins today's lossy passthrough as the green-to-red oracle
  for the Phase 2 remap work.

- Regression coverage for LS-A-8/9/19 and LS-CP-4 (#163/165/166/169)
  — closed every missing-test entry the LS-N gate surfaced.

- CI footprint reduction (#171) — bench/fuzz/ci skip on docs- and
  safety-only PRs; meld is a leaner consumer of the shared fleet.

- fuzz.yml musl-target drop (#170, closes #168) — fixes the
  recurring "sanitizer incompatible with statically linked libc"
  fuzz failures.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>