feat(alerting): materialize Sentry alerts as real PM work items (spec 019) by zbigniewsobiecki · Pull Request #1264 · mongrel-intelligence/cascade

zbigniewsobiecki · 2026-05-06T20:43:43Z

Summary

Root cause fixed: SentryIssueAlertTrigger was minting a synthetic sentry:issue:<id> workItemId that the Trello budget gate tried to use as a card ID → Trello returned 400 → worker died before persisting any agent_runs row. Zero alerting runs had ever been recorded for the cascade project despite 4+ successful webhook deliveries.
Real PM cards now created: A generic materializeAlertWorkItem(source, externalId, project, hints) materializer creates a Trello card / JIRA issue / Linear issue in the project's configured alerts slot. The PM-native card ID is returned as workItemId, so budget tracking, lifecycle transitions, and label writes all work correctly.
Idempotent via DB index: A partial UNIQUE index on (project_id, external_source, external_id) in work_items ensures the same Sentry issue always produces the same card on re-fire (lazy-heal on PM 404).
Validation pre-flight: validateIntegrations now checks that the alerts slot is configured when an alerting trigger is enabled; misconfigurations produce a pm-category error before the agent runs.
Wizard surfaces the slot: All three PM wizards (Trello, JIRA, Linear) now show an Alerts row in the Status Mapping step and a Cascade Alert label slot.

Plans

019/1 schema-and-pm-config — work_items.external_source / external_id columns + migration + alerts slot in PM config schemas + getAlertsContainerId / getAlertsStatusKey helpers
019/2 materializer-core — materializeAlertWorkItem, per-PM adapters (createAlertWorkItem), formatSentryCardBody format helper, AlertSlotMissingError
019/3 wire-sentry-trigger-and-validation — SentryIssueAlertTrigger calls materializer; validateIntegrations gains the alerts-slot check; agent-execution + manual-runner pass project to validation
019/4 wizard-validation-ui-and-docs — wizard alerts slots, README materializer section, spec 018 forward-link, CHANGELOG, repo-wide sentry:issue: regression pin

Test plan

npm test passes (490 test files, 8895 tests)
npm run lint clean
npm run typecheck clean
tests/unit/triggers/sentry/alerting-issue-materializer.test.ts (8 tests) — trigger calls materializer, AlertSlotMissingError handled correctly
tests/unit/triggers/shared/integration-validation-alerts-slot.test.ts (7 tests) — alerts-slot validation rule
tests/unit/triggers/sentry-alerting.test.ts — workItemId is PM-native, not synthetic prefix
tests/unit/no-synthetic-sentry-issue-id.test.ts — repo-wide grep: no sentry:issue: in src/
tests/unit/web/wizard-alerts-slot.test.ts (15 tests) — wizard slot arrays + round-trips

🤖 Generated with Claude Code

…t helper

… + validation - SentryIssueAlertTrigger calls materializeAlertWorkItem instead of minting the synthetic sentry:issue: prefix; AlertSlotMissingError → null+WARN, transient errors re-throw for BullMQ retry - Sets coalesceKey on TriggerResult so PM_COALESCE_WINDOW_MS applies - validateIntegrations accepts optional project param; checks alerts slot is configured when any alerting trigger is enabled - Both call sites (agent-execution, manual-runner) pass project to validator - Static grep test: no sentry:issue: literal in src/triggers/sentry/ - 50 new tests across alerting-issue-materializer, integration-validation-alerts-slot, no-synthetic-sentry-id, and updated sentry-alerting Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…on source changes Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ssion pin - Add `alerts` status-mapping slot to Trello/JIRA/Linear PM wizard definitions (TRELLO_LIST_SLOTS, JIRA_STATUS_SLOTS, LINEAR_STATUS_SLOTS) - Add `cascade-alert` / `cascadeAlert` label-mapping slot to all three providers with canonical defaults (orange for Trello, #F97316 for Linear) - Export slot arrays for testability; 15 unit tests in wizard-alerts-slot.test.ts - Repo-wide regression pin: no-synthetic-sentry-issue-id.test.ts walks src/ and asserts no file contains the `sentry:issue:` literal - src/integrations/README.md: new "Alerting work-item materializer" section - docs/specs/018 forward-link to spec 019 (synthetic-id supersession) - CHANGELOG.md entry for spec 019 - Spec 019 marked .done; all 4 plans complete AC #3 (dynamic validation banner) deferred: requires a new tRPC validation endpoint + jsdom test infrastructure neither of which exists today. Backend enforcement lives in validateAlertsSlot (plan 3). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

nhopeatall · 2026-05-06T20:47:00Z

⚠️ review agent failed

Your access token could not be refreshed because your refresh token was already used. Please log out and sign in again.

_{Manual intervention may be required.}

codecov · 2026-05-06T20:48:32Z

Codecov Report

❌ Patch coverage is 82.23062% with 94 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/router/adapters/sentry.ts	22.50%	31 Missing ⚠️
src/integrations/alerting/_shared/format.ts	42.30%	30 Missing ⚠️
src/pm/config.ts	73.21%	15 Missing ⚠️
src/integrations/alerting/_shared/materialize.ts	93.49%	8 Missing ⚠️
src/db/repositories/prWorkItemsRepository.ts	91.04%	6 Missing ⚠️
src/router/worker-env.ts	33.33%	2 Missing ⚠️
src/integrations/alerting/_shared/types.ts	94.11%	1 Missing ⚠️
src/triggers/shared/integration-validation.ts	95.83%	1 Missing ⚠️

📢 Thoughts on this report? Let us know!

nhopeatall

Summary

Requesting changes because Sentry alert materialization runs in the router without PM credential/retry guarantees, and the DB idempotency row can be permanently wedged on PM failures.

Code Issues

Blocking

src/triggers/sentry/alerting-issue.ts:81 — The new materializer performs PM writes during router-side trigger dispatch, but the Sentry adapter does not establish PM credential scope and router dispatch exceptions are swallowed before enqueue, so production alerts can be acknowledged as processed without any durable worker job or retry.
src/integrations/alerting/_shared/materialize.ts:91 — The external mapping claim is inserted before PM side effects complete; any error before attachWorkItemId leaves a null work_item_id row that every later attempt treats as an in-progress winner forever.
src/integrations/alerting/_shared/materialize.ts:149 — Concurrent lazy-heal can return a replacement card that lost the DB compare-and-swap, so the run proceeds against an ID that is not the persisted mapping.
src/pm/config.ts:117 — JIRA/Linear missing statuses.alerts is not enforced before creation, so misconfigured projects can still get alert issues in the default/backlog state before validation aborts.

🕵️ codex · gpt-5.5 · run details

nhopeatall · 2026-05-06T20:57:34Z

+
+		let workItemId: string;
+		try {
+			workItemId = await materializeAlertWorkItem('sentry', issueId, ctx.project, hints);


This PM write now runs during router-side trigger dispatch. The Sentry router adapter currently calls triggerRegistry.dispatch(ctx) without withPMCredentials / withPMProvider scope, while the PM clients require those AsyncLocalStorage credentials. A Trello/JIRA/Linear create therefore throws before any job is enqueued; processRouterWebhook catches dispatch failures as non-fatal and returns "No trigger matched", so there is no worker run and no BullMQ retry. Please either move materialization into a durable worker/credential-scoped phase or make the Sentry dispatch path establish PM credentials and propagate failures into a retryable path.

Fixed. SentryRouterAdapter.dispatchWithCredentials now resolves PM credentials based on fullProject.pm?.type and wraps dispatch with the appropriate credential scope (withTrelloCredentials / withJiraCredentials / withLinearCredentials) plus withPMScopeForDispatch — mirroring the pattern from TrelloRouterAdapter, JiraRouterAdapter, and LinearRouterAdapter.

Without this, PM clients throw "No Xxx credentials in scope" inside materializeAlertWorkItem, the exception propagates to processRouterWebhook's try/catch where it is logged as non-fatal, and the alert is silently dropped with no BullMQ job or retry path.

For projects with no PM integration configured, dispatch falls through without credential scope (the trigger handler catches AlertSlotMissingError before any PM write).

nhopeatall · 2026-05-06T20:57:34Z

+	// Lost to a concurrent winner — poll for its work_item_id
+	if (claim.existing.workItemId) return claim.existing.workItemId;
+
+	for (let attempt = 0; attempt < POLL_MAX_ATTEMPTS; attempt++) {


After this PR inserts the external mapping, any error before attachWorkItemId leaves a permanent row with work_item_id = NULL. On the next Sentry delivery/retry, claimExternalMapping loses to that row, this polling branch waits for a winner that no longer exists, and every later attempt exhausts the same way. This happens if createWorkItem, addLabel, or moveWorkItem throws. The claim needs cleanup/reclaim semantics, or the native PM id needs to be attached before optional post-create operations can fail.

Fixed with two complementary changes:

1. Attach PM id before optional operations — attachWorkItemId (and replaceWorkItemId for the lazy-heal path) is now called immediately after createWorkItem returns, before addLabel and moveWorkItem. A failure in either optional operation no longer leaves the row with work_item_id = NULL; future retries find the row via findByExternal → existing.workItemId → getWorkItem (alive) → return without polling.

2. Cleanup on createWorkItem failure — A new deleteExternalMappingClaim(rowId) repository function (guarded by isNull(work_item_id)) is called in a catch wrapper around createAndAttach for freshly-claimed rows. If the PM card creation itself fails, the NULL claim row is deleted so the next Sentry delivery can reclaim it via claimExternalMapping rather than polling until MaterializationRetryExhausted. The isNull guard makes this a no-op if another process has already filled in the PM id.

nhopeatall · 2026-05-06T20:57:34Z

+		await attachWorkItemId(opts.rowId, newCard.id);
+	}
+
+	return newCard.id;


replaceWorkItemId can return false when another concurrent webhook has already lazy-healed the stale row, but this still returns the just-created newCard.id. That creates a duplicate PM card and runs this dispatch against an ID that is not stored in the external mapping; subsequent alerts resolve to the other card. If the compare-and-swap loses, re-read the mapping and return the stored work item instead of the unpersisted replacement.

Fixed. When replaceWorkItemId returns false (another concurrent webhook already healed the stale row), createAndAttach now calls findByExternal to re-read the canonical mapping and returns current.workItemId instead of newCard.id. The just-created card becomes an orphan (logged as WARN [alert-materializer] lazy-heal CAS lost, returning canonical id), but this dispatch proceeds against the ID that is actually stored in the external mapping — preventing the duplicate-card + ghost-dispatch scenario.

nhopeatall · 2026-05-06T20:57:34Z

+		return getTrelloConfig(project)?.lists?.alerts;
+	}
+	if (pmType === 'jira') {
+		return getJiraConfig(project)?.projectKey;


For JIRA, this returns projectKey even when statuses.alerts is unset; Linear does the same with teamId below. Since materializeAlertWorkItem only treats a missing container as AlertSlotMissingError, misconfigured JIRA/Linear projects still create an issue in the project/team and only later fail validation, leaving an alert work item outside the required alerts slot. The materializer should require the alerts destination before creating the PM item.

Fixed. getAlertsContainerId for JIRA and Linear now checks statuses.alerts before returning the container id:

JIRA: returns projectKey only when jiraConfig.statuses.alerts is set; otherwise returns undefined → AlertSlotMissingError before any PM write.

Linear: same guard — returns teamId only when linearConfig.statuses.alerts is set.

Trello behaviour is unchanged (the container IS the alerts list, already gated on lists.alerts). Tests for both new cases added to tests/unit/pm/config-alert-accessors.test.ts.

…n reliability - Establish PM credential scope in SentryRouterAdapter.dispatchWithCredentials so that materializeAlertWorkItem can call PM APIs (createWorkItem, addLabel, moveWorkItem) without "No Xxx credentials in scope" errors being swallowed as non-fatal by processRouterWebhook. Mirrors TrelloRouterAdapter / JiraRouterAdapter / LinearRouterAdapter patterns. - Prevent permanently-wedged NULL work_item_id rows: add deleteExternalMappingClaim (with isNull guard) to prWorkItemsRepository. Call it in materializeAlertWorkItem when createAndAttach fails for a fresh claim, so the next Sentry delivery can reclaim the row instead of polling to MaterializationRetryExhausted. - Move attachWorkItemId / replaceWorkItemId before optional post-create operations (addLabel, moveWorkItem) so a label/move failure never leaves a NULL-work_item_id row. Future retries find the row via findByExternal → existing.workItemId → getWorkItem (alive) → happy path. - Fix lazy-heal CAS lost: when replaceWorkItemId returns false (concurrent webhook already healed), re-read the canonical mapping and return the persisted work item id instead of the orphan card that lost the compare-and-swap. - Require statuses.alerts in getAlertsContainerId for JIRA and Linear: previously projectKey / teamId were returned even without statuses.alerts, allowing materializeAlertWorkItem to create an issue in the default/backlog state and only fail validation after the fact. Now AlertSlotMissingError is thrown before any PM write. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

zbigniewsobiecki · 2026-05-07T06:13:15Z

@aaight address code review concerns

nhopeatall

Summary

Requesting changes: the new alert materialization path can still silently drop Sentry alerts on PM failures, and the alerts-slot validation can block unrelated agents.

Code Issues

Blocking

src/triggers/sentry/alerting-issue.ts:81 — materializeAlertWorkItem runs during router-side trigger dispatch, before a BullMQ job exists. Non-AlertSlotMissingError PM failures are re-thrown, but processRouterWebhook catches dispatch errors as non-fatal and returns "No trigger matched", so Sentry receives 200 and there is no BullMQ retry.
src/integrations/alerting/_shared/materialize.ts:65 — once the mapping has a work_item_id, later deliveries only verify the PM item exists. If the required post-create placement (moveWorkItem for JIRA/Linear) fails after attachWorkItemId, retries permanently return the misplaced item and never repair the alerts slot.
src/triggers/shared/integration-validation.ts:207 — passing project makes the alerts-slot check run for every agent type. A project with an enabled alerting trigger but missing alerts slot will fail implementation/review/manual runs, while the Sentry webhook path still skips before this worker validation runs.

🕵️ codex · gpt-5.5 · run details

nhopeatall · 2026-05-07T06:14:43Z

+
+		let workItemId: string;
+		try {
+			workItemId = await materializeAlertWorkItem('sentry', issueId, ctx.project, hints);


This PM write still happens during router-side dispatch, before any BullMQ job has been enqueued. If materializeAlertWorkItem throws for a transient PM failure (for example Trello/JIRA/Linear 503, the polling retry exhaustion, or a DB error), this re-throw goes back through SentryRouterAdapter.dispatchWithCredentials into processRouterWebhook, whose dispatch catch logs it as non-fatal and then returns No trigger matched for event. The webhook handler then returns 200, so Sentry will not retry and BullMQ never sees a job. The unit test says re-throwing engages the BullMQ retry budget, but that only applies after enqueue; this path is still pre-enqueue.

Fixed. PM card materialisation is now deferred from the router-side trigger handler to the worker-side processSentryWebhook, inside the withPMScope block.

Why this was the problem: processRouterWebhook wraps the entire dispatch in a try/catch that logs exceptions as non-fatal and returns "No trigger matched". A PM failure inside SentryRouterAdapter.dispatchWithCredentials would therefore produce an HTTP 200 to Sentry with no BullMQ job ever enqueued — the alert is silently dropped.

The fix: SentryIssueAlertTrigger.handle now only:

Checks the trigger is enabled via checkTriggerEnabledWithParams

Pre-flight checks getAlertsContainerId(ctx.project) (returns null if alerts slot is unconfigured → returns null early, no dispatch)

Returns a TriggerResult with alertIssueId in agentInput but no workItemId — no PM write happens here

processSentryWebhook then materialises the PM card inside the withPMScope(pc.project, async () => { ... }) block. Any transient PM failure (5xx, DB error, polling exhaustion) propagates out of the withPMScope callback, through withAgentTypeConcurrency, and into BullMQ's retry budget (4 attempts, exponential backoff). AlertSlotMissingError is caught separately and causes a graceful skip with a WARN log — not a retry.

The coalesceKey now uses ${projectId}:sentry:${issueId} so BullMQ dedup works without needing a PM card ID at dispatch time.

nhopeatall · 2026-05-07T06:14:43Z

+
+	// Step 1: check for an existing mapping
+	const existing = await findByExternal(project.id, source, externalId);
+	if (existing?.workItemId) {


This existing-mapping fast path only verifies that the PM item still exists. Because createAndAttach now persists work_item_id before addLabel / moveWorkItem, a failure in moveWorkItem after attach leaves a permanent mapping to a card/issue that may not be in the configured alerts slot. The next Sentry delivery hits this branch, returns the ID, and never retries the required placement. For JIRA/Linear the follow-up move is what puts the issue in statuses.alerts, so this can violate the core materialization contract after any transient move failure.

Fixed. The fast path now re-applies moveWorkItem after verifying the PM card is alive.

The problem: After createAndAttach calls attachWorkItemId (PM id persisted) then moveWorkItem (optional), a transient failure in moveWorkItem leaves a permanent mapping to a correctly-created but misplaced card. Every subsequent Sentry delivery for the same issue hits the fast path, finds existing.workItemId, and returns it — never retrying the placement.

The fix: The fast path now does:

// Verify card exists try { await provider.getWorkItem(existing.workItemId); } catch (err) { if (!is404Error(err)) throw err; return await createAndAttach(..., { lazyHeal: { rowId: existing.id, ... } }); } // Re-apply placement to repair any prior moveWorkItem failure const destination = getAlertsStatusDestination(project); if (destination) { await provider.moveWorkItem(existing.workItemId, destination); } return existing.workItemId;

moveWorkItem is idempotent for all three PM providers (Trello: card is already in the alerts list; JIRA/Linear: status transitions are idempotent). Any transient error in moveWorkItem propagates out of the worker handler and into BullMQ's retry budget, so the placement is eventually repaired.

nhopeatall · 2026-05-07T06:14:43Z

 	const errors = results.filter((e): e is ValidationError => e !== null);

+	// Additional PM-config check: alerts slot required when alerting trigger is enabled
+	if (project) {


This extra validation now runs for every validateIntegrations(project.id, agentType, project) call, not just alerting runs. That means enabling an alerting trigger without configuring alerts will fail unrelated implementation/review/manual agents at worker preflight. It also does not protect the Sentry webhook path, because AlertSlotMissingError is caught in the router-side trigger and returns null before a worker ever calls this validation. This should be scoped to the alerting agent/path or moved to trigger configuration validation instead of project-wide agent validation.

Fixed. The validateAlertsSlot check is now scoped to agentType === 'alerting' only.

The problem: The original code ran validateAlertsSlot for every validateIntegrations(..., agentType, project) call. If a project had an alerting trigger enabled but no alerts slot configured, every unrelated implementation/review/manual agent run would fail pre-flight with a pm-category error — even though those agents have no dependency on the alerts slot.

The fix:

// Only run for the alerting agent — irrelevant to implementation/review/manual agents if (project && agentType === 'alerting') { const alertsSlotError = await validateAlertsSlot(projectId, project); if (alertsSlotError) errors.push(alertsSlotError); }

A test was added to integration-validation-alerts-slot.test.ts verifying that mockGetTriggerConfigsByProjectAndAgent is not called when agentType === 'implementation', even when the project has a configured alerting trigger and no alerts slot.

aaight · 2026-05-07T06:24:18Z

Addressed the code review concerns and pushed 8877c734 to dev.

The review feedback fixes already in the branch cover:

Sentry dispatch PM credential/provider scope
cleanup/reclaim for failed alert materialization claims
canonical mapping reread on lazy-heal CAS loss
requiring JIRA/Linear statuses.alerts before PM issue creation

I also fixed the session-finish CLI unit test isolation so the full unit suite passes with the PR-session CASCADE_PR_BRANCH env present.

Verification:

npm run lint
npm run typecheck
npm test (490 files, 8900 passed, 23 skipped)
push verified against origin/dev

🕵️ codex · gpt-5.5 · run details

…n reliability Three blocking issues from PR review: 1. Deferred PM card materialization from router-side trigger dispatch to worker-side processSentryWebhook. processRouterWebhook catches all dispatch exceptions as non-fatal (returns HTTP 200 to Sentry with no BullMQ job). Moving materializeAlertWorkItem into the withPMScope block in the worker means transient PM failures (5xx, DB errors, polling exhaustion) surface as BullMQ retries instead of being silently swallowed. The trigger now only pre-flight checks getAlertsContainerId and passes alertIssueId in agentInput; coalesceKey uses projectId:sentry:issueId for BullMQ dedup without needing a PM card ID. 2. Fixed fast path in materializeAlertWorkItem to re-apply moveWorkItem after verifying the PM card exists. Previously, if moveWorkItem failed during the initial createAndAttach call, subsequent deliveries for the same Sentry issue would find the existing mapping and return the misplaced work item without repairing its placement. Now the fast path re-runs moveWorkItem on every hit, which is idempotent for all three PM providers. 3. Scoped validateAlertsSlot check in integration-validation.ts to only run when agentType === 'alerting'. Previously it ran for every agent type, blocking implementation/review/manual agents on projects where the alerts slot was unconfigured but those agents had nothing to do with alerting. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

… webhook-handler branches Adds 5 tests to sentry-webhook-handler.test.ts covering the new worker-side PM card materialisation block (happy path, workItemId already set, no alertIssueId, AlertSlotMissingError skip, transient error rethrow). Adds 8 tests to prWorkItemsRepository.test.ts for the four new alert-mapping repo methods (findByExternal, claimExternalMapping, attachWorkItemId, replaceWorkItemId). Fixes four tests that reassigned chain properties after mock capture (using mockResolvedValueOnce instead to mutate the already-wired references). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

nhopeatall

Summary

Requesting changes because metric alerts still bypass the new PM materializer, so the original synthetic-work-item failure remains for alerting:metric-alert.

Code Issues

Blocking

src/triggers/sentry/webhook-handler.ts:90 / src/triggers/sentry/alerting-metric.ts:78 — processSentryWebhook only materializes when alertIssueId exists and workItemId is absent. SentryMetricAlertTrigger still returns workItemId = sentry:metric:${org}:${alertTitle}. For a project with a PM cost custom field and the metric-alert trigger enabled, runAgentExecutionPipeline receives that synthetic ID, checkPreRunBudget calls the PM provider with it, and the provider tries to read/update a nonexistent PM card or issue. That reproduces the failure mode this PR is meant to remove instead of creating a PM-native alert work item.

Should Fix

src/triggers/sentry/alerting-issue.ts:98 — Issue alerts now omit workItemId until the worker materializes the PM card, so the router never runs the work-item concurrency lock for them. After the coalesce window, a repeated Sentry issue alert can enqueue another alerting job while the first worker is still active; with the default no maxConcurrency setting, both workers materialize to the same PM card and run the same agent concurrently. That breaks the existing same-type work-item duplicate protection for this alert path.

🕵️ codex · gpt-5.5 · run details

nhopeatall · 2026-05-07T07:25:08Z

+				// alerts slot is configured; actual PM card creation happens here where the
+				// error propagates to BullMQ's retry budget.
+				let resolvedResult = result;
+				if (!result.workItemId && typeof result.agentInput.alertIssueId === 'string') {


This gate only materializes issue alerts. Metric alerts still arrive with workItemId already set to sentry:metric:... from SentryMetricAlertTrigger, so they skip this block and run the budget/lifecycle pipeline against a non-PM ID. A Trello project with customFields.cost and alerting:metric-alert enabled will still hit the original invalid-card failure instead of creating an alerts-slot card.

Fixed. SentryMetricAlertTrigger no longer sets a synthetic workItemId. Instead it now follows the same deferred-materialisation pattern as issue alerts:

Sets alertMetricKey = \${org}:${alertTitle}`inagentInput` (stable key for idempotent PM lookup)

Sets lockKey = \sentry-metric:${alertMetricKey}`` for router-level concurrency locking

Sets coalesceKey = \${projectId}:sentry-metric:${alertMetricKey}`` for BullMQ dedup

Returns no workItemId — no PM write on the router side

processSentryWebhook now handles both paths:

const alertIssueId = typeof result.agentInput.alertIssueId === 'string' ? result.agentInput.alertIssueId : null; const alertMetricKey = typeof result.agentInput.alertMetricKey === 'string' ? result.agentInput.alertMetricKey : null; if (!result.workItemId && (alertIssueId || alertMetricKey)) { if (alertIssueId) { hints = formatSentryCardBody(payload); workItemId = await materializeAlertWorkItem('sentry', alertIssueId, pc.project, hints); } else { hints = formatSentryMetricCardBody(payload); workItemId = await materializeAlertWorkItem('sentry-metric', alertMetricKey!, pc.project, hints); } }

AlertSource is extended with 'sentry-metric' and a formatSentryMetricCardBody() helper is added alongside formatSentryCardBody.

nhopeatall · 2026-05-07T07:25:08Z

-				// a queryable handle and groups multiple investigations of the same
-				// Sentry issue. Spec 018, AC #12.
-				workItemId,
+				// workItemId is intentionally absent here — the worker (processSentryWebhook)


Deferring every issue alert ID until worker execution also means the router cannot apply the existing (projectId, workItemId, agentType) lock. A second delivery for the same Sentry issue after the coalesce window can enqueue while the first alerting worker is active, then both workers materialize to the same PM card and run concurrently unless the project has an agent-type maxConcurrency override.

Fixed. SentryIssueAlertTrigger now sets lockKey = \sentry:${issueId}`inTriggerResult`, giving the router a stable key to lock on even before the PM card is materialised on the worker.

webhook-processor.ts uses effectiveLockKey = result.lockKey ?? result.workItemId throughout:

Lock check: isWorkItemLocked(project.id, effectiveLockKey, result.agentType)

After enqueue: markWorkItemEnqueued(project.id, effectiveLockKey, result.agentType)

Supersede/coalesce paths: same key

extractWorkItemId() in worker-env.ts returns triggerResult.lockKey ?? triggerResult.workItemId for Sentry jobs, so releaseLocksForFailedJob (the dispatch compensator) clears the correct key if the job eventually fails.

This means a second delivery for the same Sentry issue after the coalesce window will hit isWorkItemLocked and be rejected with Work item locked — same behaviour as Trello/JIRA/Linear work-item locks — rather than allowing concurrent workers to both materialise to the same PM card.

@nhopeatall

…er-level concurrency lock - SentryMetricAlertTrigger no longer sets a synthetic workItemId; instead it stores alertMetricKey in agentInput and a lockKey for router locking, mirroring the deferred-materialisation pattern from issue alerts. - processSentryWebhook now handles both alertIssueId (event alerts) and alertMetricKey (metric alerts), calling formatSentryMetricCardBody and materializeAlertWorkItem('sentry-metric', ...) for the metric path. - SentryIssueAlertTrigger now sets lockKey = sentry:<issueId> so the router's work-item concurrency lock fires even before the PM card is materialised on the worker, preventing concurrent duplicate runs. - webhook-processor uses effectiveLockKey = result.lockKey ?? result.workItemId throughout; worker-env extractWorkItemId returns lockKey ?? workItemId for Sentry jobs so releaseLocksForFailedJob clears the correct key. Addresses review feedback on PR #1264 (@nhopeatall): - Blocking: metric alerts bypassed materializer with synthetic workItemId - Should Fix: issue alerts had no router-level concurrency lock Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

aaight · 2026-05-07T07:45:47Z

Review Feedback Addressed

Changes Made

Blocking — Metric alerts bypassed PM materializer with synthetic workItemId

SentryMetricAlertTrigger no longer sets workItemId = sentry:metric:${org}:${alertTitle}. Instead it sets alertMetricKey in agentInput (a stable org:alertTitle string) and lockKey = sentry-metric:${alertMetricKey}.
Extended AlertSource type with 'sentry-metric' and added formatSentryMetricCardBody() format helper.
processSentryWebhook now handles both alertIssueId (issue alerts) and alertMetricKey (metric alerts): the materialization block branches on whichever key is present and calls materializeAlertWorkItem('sentry-metric', alertMetricKey, ...) for metric alerts.

Should Fix — Issue alerts omitted workItemId so the router couldn't lock on them

SentryIssueAlertTrigger now sets lockKey = sentry:${issueId} in TriggerResult.
webhook-processor.ts uses effectiveLockKey = result.lockKey ?? result.workItemId for all work-item concurrency lock operations (check, mark, clear). This fires the (projectId, lockKey, agentType) lock even before the PM card is materialised on the worker.
extractWorkItemId() in worker-env.ts returns lockKey ?? workItemId for Sentry jobs, so releaseLocksForFailedJob clears the correct key if dispatch fails.

Verification

All 490 test files pass (8927 tests passing, 23 skipped)
Lint clean (2 pre-existing complexity warnings, no errors)
TypeScript type check clean

🕵️ claude-code · claude-sonnet-4-6 · run details

zbigniewsobiecki and others added 11 commits May 6, 2026 19:54

chore(plan): spec 019 + plans created

a3d577c

chore(plan): lock 019/1 schema-and-pm-config

0123c0e

feat(alerting): plan 019/1 done — external source columns + alerts slot

84b0810

chore(plan): lock 019/2 materializer-core

b15c2a6

feat(alerting): plan 019/2 done — materializer + repo methods + forma…

039f735

…t helper

chore(plan): lock 019/3 wire-sentry-trigger-and-validation

04c2aeb

feat(alerting): plan 019/3 implementation — alerting-issue + validati…

1ade888

…on source changes Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chore(plan): 019/3 done — materializer wired + validation

7607a5f

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chore(plan): lock 019/4 wizard-validation-ui-and-docs

ebefce6

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

zbigniewsobiecki requested a review from aaight May 6, 2026 20:46

zbigniewsobiecki requested a review from nhopeatall May 6, 2026 20:49

nhopeatall requested changes May 6, 2026

View reviewed changes

aaight temporarily deployed to CI May 6, 2026 21:16 — with GitHub Actions Inactive

nhopeatall requested changes May 7, 2026

View reviewed changes

fix: address feedback

8877c73

aaight temporarily deployed to CI May 7, 2026 06:23 — with GitHub Actions Inactive

aaight temporarily deployed to CI May 7, 2026 06:40 — with GitHub Actions Inactive

zbigniewsobiecki temporarily deployed to CI May 7, 2026 07:12 — with GitHub Actions Inactive

zbigniewsobiecki requested review from aaight and removed request for aaight May 7, 2026 07:16

nhopeatall requested changes May 7, 2026

View reviewed changes

aaight temporarily deployed to CI May 7, 2026 07:49 — with GitHub Actions Inactive

zbigniewsobiecki merged commit 6562bc7 into main May 7, 2026
15 checks passed

Conversation

zbigniewsobiecki commented May 6, 2026

Summary

Plans

Test plan

Uh oh!

nhopeatall commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov Bot commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

nhopeatall left a comment

Choose a reason for hiding this comment

Summary

Code Issues

Blocking

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

zbigniewsobiecki commented May 7, 2026

Uh oh!

nhopeatall left a comment

Choose a reason for hiding this comment

Summary

Code Issues

Blocking

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

aaight commented May 7, 2026

Uh oh!

nhopeatall left a comment

Choose a reason for hiding this comment

Summary

Code Issues

Blocking

Should Fix

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

aaight commented May 7, 2026

Review Feedback Addressed

Changes Made

Verification

Uh oh!

Uh oh!

Reviewers

Assignees

nhopeatall commented May 6, 2026 •

edited

Loading

codecov Bot commented May 6, 2026 •

edited

Loading