Skip to content

chore: release - merge dev into main#1293

Merged
zbigniewsobiecki merged 17 commits into
mainfrom
dev
May 9, 2026
Merged

chore: release - merge dev into main#1293
zbigniewsobiecki merged 17 commits into
mainfrom
dev

Conversation

@zbigniewsobiecki
Copy link
Copy Markdown
Member

Automated release PR created by the release workflow.

Commits (17):

ff5048ff Merge pull request #1292 from mongrel-intelligence/fix/agent-contract-robustness
56b56391 fix(gadgets): truncate hook output in error messages on commit/push failure
63d821fd fix(backends): distinguish empty/malformed/missing sidecar in WARN log
7de0b821 fix(implementation): harden agent contract against tool-output bloat + shell-state corruption
ce051a45 Merge pull request #1291 from mongrel-intelligence/feat/sentry-issue-lifecycle-trigger
63ac36f5 fix(alerting): declare alerting:issue-lifecycle in alerting agent definition
7345285c Merge pull request #1290 from mongrel-intelligence/refactor/cli-command-factory
26acea97 Merge pull request #1289 from mongrel-intelligence/docs/trigger-architecture-contracts
cd7e8e5f feat(alerting): handle Sentry resource:issue webhooks (Internal Integration surface)
b8ca17c1 refactor(gadgets): split CLI command factory helpers
2949b246 docs(triggers): fix resolveTriggerResult usage and resilience doc boundary
c03c2c3a docs(resilience): qualify dispatch-failure compensation boundary accurately
cfbceef5 docs(triggers): qualify buildDeferredRecheckResult as GitHub-only in all active contributor docs
03035f8e docs(trigger): qualify deferred re-check as GitHub-only in architecture docs
5e0a2c9d docs(triggers): qualify deferred re-check as GitHub-only and fix router failure table
0c10ed6d docs(trigger): update trigger architecture contracts
746e450b test(triggers): expand trigger contract conformance coverage (#1287)

aaight and others added 17 commits May 9, 2026 14:00
* test(triggers): expand trigger contract conformance coverage

* fix(test): address review feedback on trigger conformance fixtures and guard

- Mirror production `allPassing` predicate in `createCheckSuiteStatus`:
  require at least one check and treat success/skipped/neutral as passing,
  matching `githubClient.getCheckSuiteStatus()` exactly. The neutral
  conformance case now correctly yields `allPassing: true`.

- Extend the raw-trigger-literal guard regex to catch backtick template
  literals in addition to single- and double-quoted strings, closing the
  syntax-level bypass identified in the review.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…er failure table

- src/triggers/README.md: buildDeferredRecheckResult authoring guidance now
  explains that bare re-dispatch is a GitHub-only contract today. GitHub's
  buildJob() strips triggerResult and sets mergeabilityRecheckAttempt; non-GitHub
  adapters (Trello, JIRA, Linear, Sentry) always embed triggerResult, so their
  workers use resolveTriggerResult() with the pre-resolved result and skip
  registry dispatch. A non-GitHub handler returning buildDeferredRecheckResult
  would schedule a job that reuses the same agentType: null result.

- docs/architecture/02-webhook-pipeline.md: split the combined Redis failure row
  into two distinct rows. PM immediate-dispatch and coalesced-dispatch failures
  call onBlocked and return "Failed to enqueue/schedule job to Redis". Deferred
  re-check schedule failures capture Sentry under deferred_recheck_schedule_failure,
  skip onBlocked, and still return "Deferred re-check scheduled" (same as success).
  Updated the Coalescing and deferred re-check prose to match.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…re docs

The "Deferred re-checks" section in 03-trigger-system.md and the
"Deferred re-check exhaustion" section in 10-resilience.md both
described worker re-dispatch as a generic contract, but only
GitHubRouterAdapter.buildJob() strips triggerResult and sets
mergeabilityRecheckAttempt. Non-GitHub adapters embed triggerResult so
resolveTriggerResult() short-circuits, and the registry is never
re-invoked. Updated both paragraphs to mirror the GitHub-only
qualification already present in src/triggers/README.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…all active contributor docs

The builder summary table, TriggerResult JSDoc, and CLAUDE.md all described
deferred re-check as a generic re-dispatch contract. In practice, bare
re-dispatch only works for GitHub: GitHubRouterAdapter.buildJob() strips
triggerResult and sets mergeabilityRecheckAttempt, while non-GitHub adapters
embed the pre-resolved result so resolveTriggerResult() short-circuits without
registry dispatch. Update all three surfaces to state this constraint clearly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rately

The paragraph added in this PR overstated the compensation invariant by
implying all pre-worker failures flow through BullMQ `failed` →
`releaseLocksForFailedJob`. That path only applies post-enqueue.

Redis enqueue/schedule failures (`addJob` / `scheduleCoalescedJob` throwing
before `markImmediateDispatchEnqueued` / `markCoalescedDispatchEnqueued`)
have no BullMQ job and therefore no `failed` event. Lock protection on
that path comes from the success-first ordering contract: marks happen
only after a successful enqueue, so a Redis failure leaves no lock to
release. `onBlocked()` is the inline compensation.

Split the single paragraph into two distinct boundaries — enqueue/schedule
failures and post-enqueue dispatch failures — consistent with the router
outcomes table in docs/architecture/02-webhook-pipeline.md.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ndary

- Update trigger-resolution.ts "Used By" column from "Sentry (GitHub and
  PM use inline logic)" to "GitHub, PM, Sentry" — all three worker
  handlers call resolveTriggerResult() (PM: webhook-handler.ts:69,
  GitHub: webhook-handler.ts:230, Sentry: webhook-handler.ts:53).
- Replace "[inline] if triggerResult → use it, else dispatchTrigger(...)"
  in the GitHub flow diagram with the actual call:
  resolveTriggerResult(registry, ctx, triggerResult) → result.
- Amend "Every failed dispatch path flows through the BullMQ failed event"
  to "Post-enqueue dispatch failures … flow through the BullMQ failed
  event" with a forward reference to the split below, removing the
  internal contradiction with the two-boundary section added earlier.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ration surface)

Closes the silent-skip path for Sentry's "Internal Integration" / Custom
Webhook surface. Prod 2026-05-09: a wedged-lock-canary alert fired in the
cascade Sentry project, the alerting agent was enabled, but no agent ran.
Webhook log id `fbdc6d87-b962-444c-8a2a-a9452a74ff71` shows
`processed=false, decisionReason="Event unparseable or not processable"` —
the trigger was never invoked.

Root cause: `src/router/adapters/sentry.ts:31` whitelisted only
`['event_alert', 'metric_alert']` (Sentry Alert Rule surfaces). The webhook
arrived with `Sentry-Hook-Resource: issue` (Internal Integration default
surface — the natural way users wire Sentry → cascade). Spec 019 was
scoped to event_alert; the issue-lifecycle path was deferred and never
landed. Users who configured Sentry the natural way got silent skips for
every issue.

This adds end-to-end support, mirroring the event_alert pattern:

- Router adapter accepts `'issue'` resource + new test asserting it parses.
- New `SentryIssueLifecycleTrigger` (matches `resource: 'issue'` +
  `action: 'created'`) fires the alerting agent. Resolved/archived/etc.
  actions are deferred (would auto-close the cascade card; out of scope).
- Distinct AlertSource literal `'sentry-issue'` so the
  `(project_id, external_source, external_id)` partial-unique index on
  `pr_work_items` doesn't collide if the same Sentry issue arrives via
  both surfaces (event_alert and issue) — each surface materializes its
  own card.
- New `formatSentryIssueLifecycleCardBody` builds AlertHints from
  `data.issue.{title, web_url, level, shortId, culprit, metadata.{filename, function}}`.
  Mirror of `formatSentryCardBody` adapted to the issue-lifecycle payload shape.
- Worker-side `processSentryWebhook` extends the materializer dispatch with
  a third branch keyed on `agentInput.triggerEvent === 'alerting:issue-lifecycle'`.
  Same AlertSlotMissingError graceful-skip + transient-PM-error retry semantics
  as the existing two branches.
- `SentryIssuePayload` type updated to match the actual Sentry webhook shape
  (nested `data.issue.{...}` instead of flattened `data.{...}` — the captured
  prod fixture confirmed the existing type was wrong).

Drive-by lint cleanup (per request): refactored `materializeAlertWorkItem`
to extract `reuseOrLazyHealMapping` and `pollForConcurrentWinner` helpers,
bringing the parent function under the cognitive-complexity ceiling.

Verification:
- 9112 unit tests passing (3 new test files: issue-lifecycle-format,
  issue-lifecycle handler, plus extensions to sentry-webhook-handler and
  the router/adapters/sentry tests). Captured live prod fixture used as
  the regression baseline.
- Lint clean (0 errors, 0 warnings).
- Typecheck clean.

Operator notes:
- Cascade project's PM `lists.alerts` (Trello) / `statuses.alerts`
  (JIRA, Linear) must be configured for materialisation to actually
  create a card. The pre-flight validation rule at
  `src/triggers/shared/integration-validation.ts` already emits a
  `pm`-category error when alerting is enabled but the slot is unset —
  unchanged; same message will fire for the new `'sentry-issue'` source.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…inition

Adds `alerting:issue-lifecycle` to `src/agents/definitions/alerting.yaml`
so `getResolvedTriggerConfig` can find the event. Without this entry the
lookup returned `null`, causing `checkTriggerEnabledWithParams` to always
see `enabled=false` — leaving the prod incident path effectively unfixed
even though the router now accepts `resource: issue` webhooks.

Also adds `TRIGGER_EVENTS.ALERTING.ISSUE_LIFECYCLE` to the events catalog
and updates the handler + webhook-handler to use the constant (required by
the trigger-event-string consistency static guard for new handlers).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…+ shell-state corruption

Closes the silent-failure path identified by prod run d8e31665 (cascade
implementation, 2026-05-09): the agent created PR #1290 successfully but the
run was marked failed with "Agent completed but no authoritative PR creation
was recorded" — and operators had zero signal of why. Three independent
buggy layers compounded; fix all three so any one being robust catches the
failure mode.

## Layer 1 — gadget output bloat (the trigger; cascade-specific)

Cascade's lefthook pre-push runs `npm run test:fast` (159 files / 2981 tests
/ ~34 s / ~97 KB of output). That output was captured into the gadget result's
`pushOutput` field by `cascade-tools scm create-pr`. Codex's tool-result
parser couldn't reliably extract the JSON envelope buried under that volume,
retried the call, and the resulting concurrent invocations raced against the
same sidecar path leaving prUrl missing.

Fix: truncate `pushOutput` and `commitOutput` at 4 KB at the gadget result
boundary. Full hook output already streams through `runCommand`'s heartbeat
into `LLMIST_LOG_FILE` for operator visibility — only the agent-visible
result-stream copy is capped. New head + truncation-marker + tail format
preserves both ends of the output for context.

## Layer 2 — sidecar contract gap (the fault; provider-agnostic)

`writePRSidecar` accepted `prUrl: string` but never validated. If the caller
passed undefined for any reason, `JSON.stringify({prUrl: undefined, ...})`
silently omitted the key and produced a half-baked sidecar. Downstream
hydration reported "missing required fields { hasPrUrl: false }" with no
signal of what the writer actually persisted.

Fix: validate at the write boundary. Refuse to write + log ERROR with full
input context when prUrl is missing/non-string or prNumber is non-finite.
Same hardening for writeReviewSidecar's reviewUrl. Caller bugs now produce
a precise log entry instead of a silent contract violation.

## Layer 3 — diagnostic gap (operator visibility)

`hydratePrSidecar` logged "PR sidecar missing required fields { hasPrUrl:
false }" with no further context — no path, no parsed keys, no actual
values. Operators hit Loki archaeology to determine whether the file was
empty, malformed, or a valid JSON without prUrl.

Fix: hydratePrSidecar now re-reads the raw payload and dumps `sidecarPath`,
`rawSidecarKeys`, `rawSidecarPrUrl`, `rawSidecarPrNumber` so a single log
line distinguishes empty-file / valid-JSON-without-prUrl / malformed-JSON.
postProcessResult emits a Sentry capture under tag `pr_sidecar_invalid`
(with engine + agentType + identifier extras) when authoritative PR
evidence is missing — gives ops a single dashboard for prod frequency.

## Layer 4 — codex shell-state corruption (the propagation; codex-specific)

Codex's persistent bash session breaks with `ERROR codex_core::tools::router:
error=write_stdin failed: stdin is closed for this session`. Once that
signal fires, every subsequent command in the run inherits a stale stdout
buffer (we observed lint output from one command bleeding into a later
post-comment call, plus sidecar writes racing). Codex itself exits cleanly
(exit=0) — the stderr signal is the only evidence the session corrupted.

Fix: detect the signal in stderr at run completion. When present + exit=0,
return `success: false` with explicit error "codex shell-state corrupted:
write_stdin failed (stdin closed for session)" + log ERROR. Ops retry
against a fresh codex session instead of trusting potentially-corrupted
state. Conservative: false-positive case (benign stderr) pinned by a
"passes through cleanly when stderr is benign" regression test.

## Tests

- `tests/unit/gadgets/github/core/createPR.test.ts` (+3 cases) — 97 KB
  pushOutput → ~4 KB with marker; 50 KB commitOutput same; small output
  unchanged.
- `tests/unit/gadgets/session/core/sidecar.test.ts` (+4 cases) — refuses
  empty prUrl; refuses non-string prUrl; refuses non-finite prNumber;
  refuses empty reviewUrl. Each pairs an ERROR log assertion with a
  no-write filesystem assertion.
- `tests/unit/backends/sidecarManager.test.ts` (+2 cases) — diagnostic
  WARN includes rawSidecarKeys + rawSidecarPrUrl for valid-JSON-without-prUrl
  case AND for empty-file case.
- `tests/unit/backends/postProcess.test.ts` (+2 cases) — Sentry capture
  fires under tag `pr_sidecar_invalid` when authoritative PR evidence is
  missing; does NOT fire on the happy path.
- `tests/unit/backends/codex.test.ts` (+2 cases) — codex shell-state signal
  in stderr → success=false + ERROR log; benign stderr does not regress.

## Verification

- `npm test` — 9141 / 9141 passing
- `npm run lint` — clean
- `npm run typecheck` — clean
- All 4 fixes are additive; no behavioral change on healthy paths

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`readRawPRSidecar` returned `null` for all three failure modes (missing
file, empty/whitespace file, malformed JSON), so every 'PR sidecar
missing required fields' WARN emitted the same `rawSidecarKeys: null`
shape — operators could not tell the cases apart without Loki archaeology.

Replace the nullable return with a discriminated union (`RawSidecarDiagnostic`)
with four statuses: 'missing', 'empty', 'malformed', 'parsed'. Each
case carries its own supplementary fields:
- 'empty': rawByteLength (confirms zero-byte truncation vs. missing)
- 'malformed': rawByteLength + parseError + rawPreview (shows how far the
  write got and what the JSON syntax error was)
- 'parsed': full key list + prUrl/prNumber values (the original Layer-3 signal)
- 'missing': status alone (file was never written)

Update the two existing regression tests to assert the new `rawSidecarStatus`
field. Add a third test for the truncated-mid-write (malformed) case to pin
that it is distinguishable from the empty-file case.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ailure

Extend the 4 KB truncation to the error paths of `stageAndCommit` and
`pushBranch`. Previously, only the success-path `commitOutput`/`pushOutput`
fields were capped; a pre-commit or pre-push hook failure that emits 97+ KB
of test output still embedded the full output in `err.message`, which
`createCLICommand` serialises into the JSON error envelope — leaving the
same Codex parser/retry bloat path open. Full hook output stays in the
worker engine log (LLMIST_LOG_FILE) for operator visibility.

Adds two regression tests pinning the failure-path truncation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@zbigniewsobiecki zbigniewsobiecki merged commit 7369cf9 into main May 9, 2026
15 of 16 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants