feat(ci): Mythos delta-pass auto-runner (single-actor, OAuth-token) #162
Automates the human-driven discover protocol that mythos-gate.yml
currently enforces by label. On every PR that touches a Tier-5
file, runs anthropics/claude-code-action (SHA-pinned) per touched
file with scripts/mythos/discover.md as the prompt and captures a
structured `{verdict: NO_FINDINGS | FINDING}` JSON via the action's
--json-schema input. Posts a sticky <!-- mythos-auto-gate --> PR
comment with per-file results; applies mythos-pass-done on all-pass,
fails the job (without the label) on any FINDING.
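The all-pass / any-FINDING rule described above can be sketched in Rust as follows. This is only an illustration of the decision logic, not the workflow's actual aggregate step: the names `Verdict`, `parse_verdict` and `should_apply_label` are hypothetical, and treating a missing verdict or an empty result set as "do not label" is an assumption.

```rust
/// Per-file verdict, mirroring the `NO_FINDINGS | FINDING` schema values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    NoFindings,
    Finding,
}

/// Parse the `verdict` string from one scan's structured output.
/// Anything unrecognised is treated as missing output (assumption).
pub fn parse_verdict(raw: &str) -> Option<Verdict> {
    match raw {
        "NO_FINDINGS" => Some(Verdict::NoFindings),
        "FINDING" => Some(Verdict::Finding),
        _ => None,
    }
}

/// Hypothetical aggregate rule: apply the label only when at least one file
/// was scanned and every file reported NO_FINDINGS. A missing verdict is
/// handled like a FINDING, so the gate blocks rather than passing silently.
pub fn should_apply_label(results: &[(&str, Option<Verdict>)]) -> bool {
    !results.is_empty()
        && results
            .iter()
            .all(|(_, verdict)| matches!(verdict, Some(Verdict::NoFindings)))
}
```

A caller would collect one `(file, verdict)` pair per matrix entry and fail the job whenever `should_apply_label` returns `false`.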
Authorization stack (defense-in-depth, "only avrabe can trigger"):
1. Job-level if: requires both `github.actor == 'avrabe'` AND the
   immutable `github.actor_id == '10056645'`. Usernames can be
   reassigned after account deletion; numeric IDs cannot.
2. Trigger is pull_request (not pull_request_target). GitHub's
   default policy keeps secrets away from fork-repo PRs.
3. claude-code-action pinned by full commit SHA, not the floating
   v1 tag. Hijacking the tag does not change what we run.
4. Explicit minimal permissions: pull-requests write (sticky comment
   + label), contents read.
5. concurrency: cancel-in-progress per PR head, so rapid push cycles
   do not burn budget.
6. Detect job path-shape-validates every Tier-5 file
   (^[a-zA-Z0-9/_.-]+$) before piping into the matrix so a hostile
   filename cannot inject through ${{ matrix.file }} downstream;
   matrix.file is read via env: in run blocks, not direct
   interpolation.
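The path-shape filter in item 6 can be illustrated with a small Rust sketch. It mirrors the `^[a-zA-Z0-9/_.-]+$` pattern without a regex crate; the function names `is_safe_path` and `partition_paths` are hypothetical, and the real detect job presumably applies the pattern in a shell step rather than in Rust.

```rust
/// Returns true when `path` matches `^[a-zA-Z0-9/_.-]+$`: non-empty and made
/// only of ASCII letters, digits, '/', '_', '.' and '-'.
pub fn is_safe_path(path: &str) -> bool {
    !path.is_empty()
        && path
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || matches!(b, b'/' | b'_' | b'.' | b'-'))
}

/// Split candidate paths into (accepted, rejected). In this sketch the caller
/// is expected to log the rejected ones as warnings and keep them out of the
/// scan matrix.
pub fn partition_paths<'a>(paths: &[&'a str]) -> (Vec<&'a str>, Vec<&'a str>) {
    paths.iter().copied().partition(|p| is_safe_path(p))
}
```

Note that the pattern only excludes shell and expression metacharacters; a path such as `../x` still matches it, so the filter is not a traversal check.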
Auth flow uses CLAUDE_CODE_OAUTH_TOKEN from avrabe's Max plan; no
separate API billing. Token usage draws from the subscription rate
limit shared with interactive Claude Code use.
Label-only mythos-gate.yml remains source-of-truth — the auto-runner
is one way the label gets applied, not the only way. Contributors
without OAuth access continue using the honor-system flow per
AGENTS.md.
Setup (one-time, on maintainer machine):
claude update # ensure v1.0.44+
claude setup-token # prints CLAUDE_CODE_OAUTH_TOKEN
Then add the token as repo secret CLAUDE_CODE_OAUTH_TOKEN.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
LS-N verification gate
Approved Failed LS entries(none) Missing regression tests
Updated automatically by |
Admin-merge per #139 (smithy capacity)
9 checks green + 2 expected skips ( This is the same admin-merge case as PR #161 yesterday. The workflow added here is single-actor-scoped (only avrabe can trigger it), and the new Admin-merge counter for #139 since last reset:
Will track the reset back into #139 after merge.
…#163)

PR #156 fixed the `imp.name.ends_with(rn)` suffix-collision bug in `Merger::add_unresolved_imports` (the dedup-skip path propagating `resource_rep_by_component` / `resource_new_by_component` entries) but landed without a regression test. The LS-N verification gate (#161) surfaced this gap as missing coverage on the next-to-last sweep.

Extracts the exact-match lookup into a private helper `find_exact_resource_import_idx` and adds three regression tests:

- `ls_a_19_exact_match_picks_float_not_bigfloat`: both `float` and `bigfloat` in tracking; asking for `[resource-rep]float` must return float's index, not bigfloat's. The buggy `ends_with` form would match bigfloat under some iteration orders.
- `ls_a_19_no_match_returns_none_even_with_suffix_collision`: only bigfloat in tracking, caller asks for plain float. Exact match must return None; the buggy `ends_with` form would return bigfloat's index.
- `ls_a_19_resource_new_lookup_is_also_exact`: same suffix-collision case for the `[resource-new]` table.

LS-N gate result moves from 15/19 verified (4 missing) to 16/19 verified (3 missing). Remaining missing-bucket entries are LS-CP-4 (likely subsumed by #130 Phase 2), LS-A-8, LS-A-9, tracked separately as research items.

This is also the first real PR exercising the `mythos-auto.yml` workflow added in #162: it touches a Tier-5 file (`merger.rs`) so the auto-runner will fire end-to-end on PR open.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
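The suffix collision is easy to reproduce in isolation. The sketch below is not the code from `merger.rs`: the function names, the slice-of-names data shape and the `prefix` parameter are all assumptions made for illustration, and only the `ends_with` versus exact-equality contrast is taken from the description above.

```rust
/// Buggy form: suffix match. `[resource-rep]bigfloat` ends with `float`,
/// so a lookup for `float` can land on bigfloat's entry.
pub fn find_by_suffix(import_names: &[&str], rn: &str) -> Option<usize> {
    import_names.iter().position(|name| name.ends_with(rn))
}

/// Fixed form (hypothetical signature): build the full expected import name
/// and require exact equality.
pub fn find_exact(import_names: &[&str], prefix: &str, rn: &str) -> Option<usize> {
    let wanted = format!("{prefix}{rn}");
    import_names.iter().position(|name| *name == wanted.as_str())
}
```

With `["[resource-rep]bigfloat", "[resource-rep]float"]` as input, `find_by_suffix(.., "float")` returns index 0 (bigfloat), while `find_exact(.., "[resource-rep]", "float")` returns index 1. With only bigfloat present, the exact form returns `None` and the suffix form still returns bigfloat's index.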
PR #163 was the first end-to-end test of mythos-auto.yml (added in #162). It surfaced three plumbing issues:

1. The action's `oven-sh/setup-bun` step requires `unzip`, which is not installed by default on the rust-cpu runners. Without it the action's bun-based post-step entrypoints exit 127, and the whole scan-step exits failure before emitting structured output.
2. The `Slugify file path for artifact name` step sat AFTER the discover step with no `if: always()`. When discover failed, the slug step was skipped, leaving `steps.slug.outputs.slug` empty. Downstream `if: always()` steps then wrote `mythos-out/.json` (no slug) and `upload-artifact` complained "No files were found with the provided path: mythos-out/.json".
3. The `Save structured output as artifact` step embedded `${{ steps.slug.outputs.slug }}` in the run-block via direct interpolation. Silently substituting an empty slug into a file path is a footgun even if the slug step had run; better to read slug from an env var and fail loudly on empty.

Fixes:

- Slugify step moves BEFORE the discover step, so it always runs (no `if: always()` needed because both detect+slug are the precondition for everything below).
- New `Install unzip (required by setup-bun)` step, best-effort apt install mirroring the action's own subprocess-isolation install pattern. `continue-on-error: true` so non-Debian runners don't break the workflow.
- `Save structured output as artifact` reads slug from env (`SLUG`) rather than `${{ }}` interpolation; explicitly errors out if SLUG is empty rather than silently writing to a malformed path.
- `upload-artifact` step gains an extra `steps.slug.outputs.slug != ''` guard so it never tries to upload with an empty name.

The placeholder-FINDING fallback (the part that surfaced these issues by writing "discover step failed before emitting structured output" into the aggregate comment) is intentional and stays: it guarantees the gate blocks on workflow failure rather than silently passing.
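The "fail loudly on empty slug" idea can be sketched in Rust as below. The actual step is a shell run-block, and the exact slug rule the workflow uses is not shown in this PR, so the replace-every-non-alphanumeric rule and the names `slugify` and `artifact_path` are assumptions.

```rust
/// Hypothetical slug rule: keep ASCII letters and digits, replace every other
/// character with '-'. The workflow's real rule may differ.
pub fn slugify(path: &str) -> String {
    path.chars()
        .map(|c| if c.is_ascii_alphanumeric() { c } else { '-' })
        .collect()
}

/// Build the artifact path from a slug, refusing an empty slug instead of
/// silently producing `mythos-out/.json`.
pub fn artifact_path(slug: &str) -> Result<String, String> {
    if slug.is_empty() {
        return Err("SLUG is empty; refusing to write mythos-out/.json".to_string());
    }
    Ok(format!("mythos-out/{slug}.json"))
}
```

Returning an error for the empty case corresponds to the explicit non-zero exit in the run-block: the failure shows up at the step that caused it rather than later in `upload-artifact`.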
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
P3 cross-component stream-pair detection foundation + a fully operational Mythos delta-pass auto-runner. 12 commits since v0.8.1.

Headline changes:

- Cross-component stream<T> pairing detection (#141, ADR-3). The StreamPairGraph foundation for the in-module stream adapter: meld now inventories at resolve time which fused components form producer -> consumer stream pairings. The ring-buffer / copy-chain emitter is a runtime-verified follow-up (ADR-3 Path N).
- Mythos delta-pass auto-runner (#162, #164, #170, #173, #175). The AI-driven discover protocol now runs automatically on every Tier-5 PR by the maintainer, via claude-code-action on a Max-plan OAuth token. Five plumbing fixes brought it to a working end-to-end state: scan -> NO_FINDINGS verdict -> sticky comment -> mythos-pass-done label.
- LS-N verification gate (#161, #165). Every approved loss-scenario in safety/stpa/loss-scenarios.yaml is now enforced to have a matching ls_<letter>_<num>_* regression test; 19/19 verified.
- DWARF / witness-mapping discovery (#131). Phase 1 of the #130 epic; pins today's lossy passthrough as the green-to-red oracle for the Phase 2 remap work.
- Regression coverage for LS-A-8/9/19 and LS-CP-4 (#163/165/166/169). Closed every missing-test entry the LS-N gate surfaced.
- CI footprint reduction (#171). bench/fuzz/ci skip on docs- and safety-only PRs; meld is a leaner consumer of the shared fleet.
- fuzz.yml musl-target drop (#170, closes #168). Fixes the recurring "sanitizer incompatible with statically linked libc" fuzz failures.

Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
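The LS-N naming rule (an approved entry such as LS-A-19 needs a test named `ls_a_19_*`) can be sketched as a prefix check. This is a guess at the matching logic, not the gate's real implementation: `test_prefix` and `missing_tests` are hypothetical names, and the lowercase-and-underscore mapping is inferred from the test names quoted in the commit messages.

```rust
/// Map a loss-scenario id such as "LS-A-19" to the expected test-name prefix
/// "ls_a_19_". The trailing underscore keeps LS-A-19 from matching a
/// hypothetical ls_a_190_* test.
pub fn test_prefix(ls_id: &str) -> String {
    format!("{}_", ls_id.to_ascii_lowercase().replace('-', "_"))
}

/// Return the approved ids that have no test whose name starts with the
/// expected prefix.
pub fn missing_tests<'a>(approved: &[&'a str], test_names: &[&str]) -> Vec<&'a str> {
    approved
        .iter()
        .copied()
        .filter(|id| {
            let prefix = test_prefix(id);
            !test_names.iter().any(|t| t.starts_with(prefix.as_str()))
        })
        .collect()
}
```

A gate built on this would fail whenever `missing_tests` returns a non-empty list.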
Summary
Automates the Mythos discover protocol that mythos-gate.yml currently enforces by label only. On every PR that touches a Tier-5 file, anthropics/claude-code-action (SHA-pinned) runs against each touched file with scripts/mythos/discover.md as the prompt, emits a structured JSON verdict (NO_FINDINGS or FINDING), and the aggregate job posts a sticky <!-- mythos-auto-gate --> PR comment and applies mythos-pass-done on all-pass.

Authorization stack: "only avrabe can trigger this"
1. `if: github.actor == 'avrabe' && github.actor_id == '10056645'`
2. `pull_request` (not `pull_request_target`)
3. `claude-code-action` pinned by commit SHA `51ea8ea7...`; hijacking `v1` doesn't change what we run
4. `permissions:` (PR write, contents read)
5. `concurrency: cancel-in-progress` per PR head
6. `${{ matrix.file }}` interpolation injection blocked even if a hostile filename slips through

Phase A: your one-time setup
Then in browser: Repo Settings → Secrets and variables → Actions → New repository secret, named CLAUDE_CODE_OAUTH_TOKEN.
Once added, mark this PR ready for review and the workflow will fire on the next push.

Files
- .github/workflows/mythos-auto.yml: workflow (detect → scan matrix → aggregate)
- AGENTS.md: new "Auto-runner" subsection under Mythos pipeline
- CHANGELOG.md: [Unreleased] / Added entry

How this fits with mythos-gate.yml
mythos-gate.yml (label-only check) stays as source of truth. The auto-runner is one way the mythos-pass-done label gets applied, not the only way. Contributors without OAuth access (or non-avrabe actors) continue to use the documented honor-system flow: run discover.md in a fresh Claude Code session, post findings/NO FINDINGS comment, apply label manually.

Test plan
- actionlint if available)
- avrabe: workflow runs, posts comment, applies/withholds label per verdict
- if: fails); no token leaked, no comment posted
- any=false, downstream jobs skip cleanly
- meld-core/src/parser.rs;evil doesn't pass the path-shape filter and is logged as a warning

Cost / quota note
Token usage draws from the Max-plan subscription quota, shared with
interactive Claude Code use. A burst of Tier-5 PRs could starve
interactive sessions during the same window. Refresh-token gap
tracked at anthropics/claude-code-action#727.
🤖 Generated with Claude Code