clickhouse new syncs and verify-data #1304
Conversation
…c-session-replays
…ck-auth/stack-auth into clickhouse-sync-extra-tables
…invites' into clickhouse-sync-email-outbox-table
    # Conflicts:
    #   apps/backend/scripts/clickhouse-migrations.ts
    #   apps/backend/src/app/api/latest/internal/external-db-sync/sequencer/route.ts
    #   apps/backend/src/lib/external-db-sync.ts
    #   apps/e2e/tests/backend/endpoints/api/v1/external-db-sync-basics.test.ts
    #   apps/e2e/tests/backend/endpoints/api/v1/external-db-sync-utils.ts
…le' into clickhouse-sync-session-replays
No actionable comments were generated in the recent review. 🎉
ℹ️ Recent review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID:
📒 Files selected for processing (1)
✅ Files skipped from review due to trivial changes (1)
📝 Walkthrough
Adds sequenceId and shouldUpdateSequenceId fields and indexes across many models; extends external-db-sync helpers and wiring to most write/delete paths; expands sequencer backfill/status and ClickHouse mappings/migrations; adds ClickHouse verification tooling, a cron tweak, extensive E2E sync tests, and dashboard/status UI updates.
Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant API
    participant Prisma
    participant Sequencer
    participant ExternalDB
    participant ClickHouse
    Client->>API: Create/Update entity (e.g., Team)
    API->>Prisma: upsert data wrapped with withExternalDbSyncUpdate()
    Prisma-->>API: row persisted (shouldUpdateSequenceId = true)
    Note over Sequencer,Prisma: Backfill / Poll
    Sequencer->>Prisma: SELECT ... FOR UPDATE SKIP LOCKED (shouldUpdateSequenceId = true)
    Prisma-->>Sequencer: rows
    Sequencer->>Prisma: UPDATE sequenceId = nextval(...), shouldUpdateSequenceId = false
    Sequencer-->>Sequencer: enqueue tenant sync
    Sequencer->>ExternalDB: apply upserts/deletes (Postgres external DB)
    ExternalDB-->>Sequencer: ack
    Sequencer->>ClickHouse: apply analytics upserts
    ClickHouse-->>Sequencer: ack
```
```mermaid
sequenceDiagram
    participant Client
    participant API
    participant Prisma
    participant DeletedRow
    participant Sequencer
    participant ExternalDB
    Client->>API: Delete entity (e.g., Team)
    API->>Prisma: recordExternalDbSyncTeamPermissionDeletionsForTeam()
    Prisma->>DeletedRow: INSERT deleted permission rows
    API->>Prisma: recordExternalDbSyncTeamMemberDeletionsForTeam()
    Prisma->>DeletedRow: INSERT deleted member rows
    API->>Prisma: recordExternalDbSyncDeletion() for Team
    Prisma->>DeletedRow: INSERT team deletion record
    API->>Prisma: DELETE Team
    Prisma-->>API: deletion committed
    Sequencer->>Prisma: process DeletedRow entries, enqueue tenants
    Sequencer->>ExternalDB: apply deletes to external DB
    ExternalDB-->>Sequencer: ack
```
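The write path in the diagrams hinges on a small tagging helper. As a rough illustration (the name matches the diagrams, but the body here is an assumption, not the repo's actual implementation), it could look like:

```typescript
// Hypothetical sketch of the write-path helper named in the diagrams above.
// withExternalDbSyncUpdate tags a Prisma payload so the sequencer's backfill
// query (WHERE "shouldUpdateSequenceId" = TRUE) will pick the row up later.
function withExternalDbSyncUpdate<T extends Record<string, unknown>>(
  data: T,
): T & { shouldUpdateSequenceId: true } {
  return { ...data, shouldUpdateSequenceId: true };
}

// Example payload as it would be passed to a Prisma update/upsert branch.
const payload = withExternalDbSyncUpdate({ displayName: "Team A" });
```

The sequencer then clears the flag in the same statement that assigns the sequence number, so a concurrent write that re-sets the flag is never lost.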
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
🚥 Pre-merge checks | ❌ 3 failed checks (2 warnings, 1 inconclusive)
Greptile Summary
This PR adds a ClickHouse data-integrity verifier script that fetches rows from both Postgres and ClickHouse for every sync mapping and compares them row-by-row, and also fixes two sync-tracking gaps — refresh token updates in
Key changes:
Confidence Score: 4/5
Safe to merge with minor follow-ups; production fixes (tokens.tsx, crud.tsx) are correct and the verifier script logic is sound. All remaining findings are P2. The most actionable one is the implicit coupling between BATCH_LIMIT and the SQL LIMIT: if the SQL limit in db-sync-mappings.ts is ever reduced below 1000, the verifier will silently stop paginating early and miss discrepancies. The JSON string-equality comparison could also produce false-positive failures if key ordering diverges between the sync writer and the verifier reader. Neither is a current production bug, but both weaken the verifier's reliability as a safety net.
apps/backend/scripts/verify-data-integrity/clickhouse-sync-verifier.ts — BATCH_LIMIT coupling and JSON comparison robustness.
Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
    participant Script as verify-data-integrity/index.ts
    participant Verifier as clickhouse-sync-verifier.ts
    participant PG as Postgres (via Prisma)
    participant CH as ClickHouse (default.* views)
    Script->>Verifier: verifyClickhouseSync({ tenancy, projectId, branchId, recurse })
    loop for each mapping in DEFAULT_DB_SYNC_MAPPINGS
        Verifier->>Verifier: skip if no clickhouse fetchQuery
        loop paginate (cursor = lastSequenceId, BATCH_LIMIT=1000)
            Verifier->>PG: $queryRawUnsafe(fetchQuery, tenancyId, lastSequenceId)
            PG-->>Verifier: batch of rows (LIMIT 1000)
            Verifier->>Verifier: filter sync_is_deleted=true rows
            Verifier->>Verifier: strip sync_* columns
            Verifier->>Verifier: advance lastSequenceId to max in batch
        end
        Verifier->>CH: SELECT * FROM default.{table} WHERE project_id AND branch_id
        CH-->>Verifier: all non-deleted rows (FINAL applied in view)
        Verifier->>Verifier: assert pgRows.length == chRows.length
        Verifier->>Verifier: sort both by [project_id, branch_id, ...sortKeys]
        loop for each row pair
            Verifier->>Verifier: normalizeRow (booleans, dates, JSON, bigints)
            Verifier->>Verifier: deepEqual → throw StackAssertionError on mismatch
        end
    end
```
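The pagination loop in the diagram can be sketched as plain TypeScript. This is a simplified stand-in: `fetchBatch` abstracts the real `$queryRawUnsafe` call, and the real verifier also filters deleted rows and strips `sync_*` columns per batch.

```typescript
// Simplified sketch of the verifier's keyset pagination shown above.
const BATCH_LIMIT = 1000; // must match the SQL LIMIT in the fetch query

type SyncRow = { sync_sequence_id: number } & Record<string, unknown>;

async function fetchAllRows(
  fetchBatch: (lastSequenceId: number) => Promise<SyncRow[]>,
): Promise<SyncRow[]> {
  const rows: SyncRow[] = [];
  let lastSequenceId = -1; // cursor: highest sequence id seen so far
  for (;;) {
    const batch = await fetchBatch(lastSequenceId);
    rows.push(...batch);
    if (batch.length < BATCH_LIMIT) break; // short batch => last page
    lastSequenceId = Math.max(...batch.map((r) => r.sync_sequence_id));
  }
  return rows;
}
```

Because termination is detected by `batch.length < BATCH_LIMIT`, the constant and the SQL `LIMIT` must agree, which is exactly the coupling the review below flags.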
Prompt To Fix All With AI
This is a comment left during a code review.
Path: apps/backend/scripts/verify-data-integrity/clickhouse-sync-verifier.ts
Line: 165-200
Comment:
**BATCH_LIMIT must stay in sync with SQL LIMIT**
`BATCH_LIMIT = 1000` is used to detect the last page of pagination via `if (batch.length < BATCH_LIMIT) break`. The SQL queries in `db-sync-mappings.ts` all use `LIMIT 1000`, so this works today. However, if the SQL LIMIT is ever reduced below `BATCH_LIMIT` (e.g., to 500), a full batch of 500 rows would satisfy `500 < 1000 → true` and stop pagination prematurely, silently skipping remaining rows. This would cause the verifier to miss discrepancies — the very thing it's supposed to catch.
Consider adding a comment to document this tight coupling, or better yet, export the limit constant from `db-sync-mappings.ts` and import it here so both sides stay in sync automatically:
```ts
// IMPORTANT: BATCH_LIMIT must equal the LIMIT clause in internalDbFetchQueries.clickhouse (currently 1000 in db-sync-mappings.ts).
// If the SQL LIMIT is reduced below this value, pagination will terminate early and silently miss rows.
const BATCH_LIMIT = 1000;
```
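Going a step further than a comment, the coupling can be removed entirely by deriving both the SQL and the pagination check from one shared constant. A sketch of that idea follows; `DB_SYNC_FETCH_LIMIT` and `buildFetchQuery` are hypothetical names, not the real module's API, and the column names are illustrative only.

```typescript
// Hypothetical sketch: one constant drives both the SQL LIMIT and the
// verifier's "short batch" check, so the two can never drift apart.
const DB_SYNC_FETCH_LIMIT = 1000;

// Builds a fetch query whose LIMIT is guaranteed to equal the constant the
// verifier paginates with.
function buildFetchQuery(table: string): string {
  return (
    `SELECT * FROM "${table}" ` +
    `WHERE "tenancyId" = $1 AND "sequenceId" > $2 ` +
    `ORDER BY "sequenceId" LIMIT ${DB_SYNC_FETCH_LIMIT}`
  );
}
```

The verifier would then import the same constant for its `batch.length < DB_SYNC_FETCH_LIMIT` check instead of declaring its own `BATCH_LIMIT`.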
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: apps/backend/scripts/verify-data-integrity/clickhouse-sync-verifier.ts
Line: 12-25
Comment:
**Use Map instead of plain object for dynamic key access**
`SORT_KEYS` is accessed with dynamic keys (`mappingName` from `Object.entries(DEFAULT_DB_SYNC_MAPPINGS)`). Per the project's custom rule, plain objects with dynamic string keys should be replaced with `Map<string, string[]>` to avoid prototype pollution (e.g., if a future mapping name were `__proto__` or `constructor`).
```ts
const SORT_KEYS = new Map<string, string[]>([
["users", ["id"]],
["contact_channels", ["id"]],
["teams", ["id"]],
["team_member_profiles", ["team_id", "user_id"]],
["team_permissions", ["team_id", "user_id", "permission_id"]],
["team_invitations", ["id"]],
["email_outboxes", ["id"]],
["session_replays", ["id"]],
["project_permissions", ["user_id", "permission_id"]],
["notification_preferences", ["id"]],
["refresh_tokens", ["id"]],
["connected_accounts", ["id"]],
]);
```
Then access as `SORT_KEYS.get(mappingName)` instead of `SORT_KEYS[mappingName]`.
**Rule Used:** Use Map<A, B> instead of plain objects when using ... ([source](https://app.greptile.com/review/custom-context?memory=cd0e08f7-0df2-43c8-8c71-97091bba4120))
**Learnt From**
[stack-auth/stack-auth#769](https://github.com/stack-auth/stack-auth/pull/769)
[stack-auth/stack-auth#835](https://github.com/stack-auth/stack-auth/pull/835)
[stack-auth/stack-auth#839](https://github.com/stack-auth/stack-auth/pull/839)
[stack-auth/stack-auth#472](https://github.com/stack-auth/stack-auth/pull/472)
[stack-auth/stack-auth#720](https://github.com/stack-auth/stack-auth/pull/720)
[stack-auth/stack-auth#700](https://github.com/stack-auth/stack-auth/pull/700)
[stack-auth/stack-auth#813](https://github.com/stack-auth/stack-auth/pull/813)
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: apps/backend/scripts/verify-data-integrity/clickhouse-sync-verifier.ts
Line: 63-77
Comment:
**JSON comparison after normalization uses string equality**
After `normalizePostgresValue` and `normalizeClickhouseValue` are applied, `json` columns become raw strings (Postgres: `JSON.stringify(parsedJsonb)`; ClickHouse: the stored string, returned as-is by `JSONEachRow`). These are then compared in `deepEqual` via `===` (string equality).
Postgres's `jsonb` type stores keys in alphabetical order, so `JSON.stringify` of a Postgres-returned object consistently produces alphabetically sorted keys. As long as the sync code also writes ClickHouse by `JSON.stringify`-ing the same Postgres-sourced object, the strings will match. However, if any code path writes to ClickHouse with a different key ordering or whitespace (e.g., extra spaces, non-alphabetical keys), this will produce false-positive failures.
To be safe, consider parsing both JSON strings and performing a deep structural comparison rather than relying on string equality:
```ts
if (columnType === "json") {
try {
return JSON.stringify(JSON.parse(value as string));
} catch {
return value;
}
}
```
Apply this on the ClickHouse side to re-normalize the stored string before comparison.
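A stricter variant of that suggestion is to canonicalize both sides before comparing: recursively sort object keys, then re-serialize, so neither key order nor whitespace can cause a false mismatch. The sketch below is illustrative, not the repo's actual normalizer.

```typescript
// Recursively sort object keys so two JSON strings with the same structure
// always serialize identically, regardless of original key order/whitespace.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    return Object.fromEntries(
      Object.entries(value as Record<string, unknown>)
        .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
        .map(([k, v]) => [k, canonicalize(v)]),
    );
  }
  return value;
}

function canonicalJson(raw: string): string {
  try {
    return JSON.stringify(canonicalize(JSON.parse(raw)));
  } catch {
    return raw; // not valid JSON — fall back to comparing the stored string
  }
}
```

Applying `canonicalJson` to both the Postgres-derived and ClickHouse-stored strings makes the `===` comparison equivalent to a structural one.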
How can I resolve this? If you propose a fix, please make it concise.
Reviews (1): Last reviewed commit: "clickhouse verify data integrity"
N2D4 left a comment
in the future we could also have a data integrity check that compares REST API output to clickhouse data, which would ensure that the sync SQL query itself is correct, but that's for later (it will require us to think a decent amount about what values it should actually have)
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
apps/backend/src/app/api/latest/auth/sessions/current/route.tsx (1)
34-48: ⚠️ Potential issue | 🟠 Major
Potential race condition: deletion recorded before actual delete succeeds.
`recordExternalDbSyncDeletion` is called before `deleteMany`. If `deleteMany` fails (e.g., due to a constraint violation or transient error), a deletion record will exist for a row that wasn't actually deleted, causing sync inconsistency. Consider wrapping both operations in a transaction, or recording the deletion only after confirming the delete succeeded.
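The transactional option can be sketched as follows. This is illustrative only: the dependencies are injected so the shape is testable in isolation, `recordDeletion` stands in for `recordExternalDbSyncDeletion`, and the client shape mimics Prisma's interactive `$transaction`; it is not the repo's exact code.

```typescript
// Sketch: delete first, then record the sync deletion inside the same
// transaction, so a failed delete can never leave an orphaned deletion record.
type Tx = {
  projectUserRefreshToken: {
    deleteMany(args: { where: { tenancyId: string, id: string } }): Promise<{ count: number }>,
  },
};

async function deleteRefreshTokenAtomically(
  client: { $transaction<T>(fn: (tx: Tx) => Promise<T>): Promise<T> },
  recordDeletion: (tx: Tx, info: Record<string, string>) => Promise<void>,
  tenancyId: string,
  refreshTokenId: string,
): Promise<void> {
  await client.$transaction(async (tx) => {
    const result = await tx.projectUserRefreshToken.deleteMany({
      where: { tenancyId, id: refreshTokenId },
    });
    if (result.count === 0) {
      // Throwing aborts the transaction, so no deletion record is written.
      throw new Error("RefreshTokenNotFoundOrExpired");
    }
    await recordDeletion(tx, {
      tableName: "ProjectUserRefreshToken",
      tenancyId,
      refreshTokenId,
    });
  });
}
```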
🔧 Proposed fix using transaction

```diff
   try {
     const prisma = await getPrismaClientForTenancy(tenancy);
-    await recordExternalDbSyncDeletion(globalPrismaClient, {
-      tableName: "ProjectUserRefreshToken",
-      tenancyId: tenancy.id,
-      refreshTokenId,
-    });
-
     const result = await globalPrismaClient.projectUserRefreshToken.deleteMany({
       where: {
         tenancyId: tenancy.id,
         id: refreshTokenId,
       },
     });

     // If no records were deleted, throw the same error as before
     if (result.count === 0) {
       throw new KnownErrors.RefreshTokenNotFoundOrExpired();
     }
+
+    await recordExternalDbSyncDeletion(globalPrismaClient, {
+      tableName: "ProjectUserRefreshToken",
+      tenancyId: tenancy.id,
+      refreshTokenId,
+    });
   } catch (e) {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/app/api/latest/auth/sessions/current/route.tsx` around lines 34 - 48, The current flow calls recordExternalDbSyncDeletion before the actual delete (globalPrismaClient.projectUserRefreshToken.deleteMany), risking a stale deletion record if the delete fails; change the logic so the delete is performed first and only on success record the external deletion, or wrap both actions in a single transaction to ensure atomicity — locate the code around getPrismaClientForTenancy, recordExternalDbSyncDeletion, and globalPrismaClient.projectUserRefreshToken.deleteMany and either (A) perform deleteMany, check result.count>0, then call recordExternalDbSyncDeletion, or (B) execute both recordExternalDbSyncDeletion and deleteMany inside a transactional block (using globalPrismaClient.$transaction or equivalent) so both succeed or both roll back.
apps/backend/src/app/api/latest/auth/sessions/crud.tsx (1)
75-86: ⚠️ Potential issue | 🟡 Minor
Same ordering concern: deletion recorded before actual delete.
Similar to `sessions/current/route.tsx`, the deletion is recorded before the actual `deleteMany`. While the prior existence check (lines 59-68) reduces the risk, a concurrent deletion between the check and delete could still cause sync inconsistency. For consistency with the other file's fix, consider recording the deletion after confirming success.
🔧 Proposed fix

```diff
-  await recordExternalDbSyncDeletion(globalPrismaClient, {
-    tableName: "ProjectUserRefreshToken",
-    tenancyId: auth.tenancy.id,
-    refreshTokenId: params.id,
-  });
   await globalPrismaClient.projectUserRefreshToken.deleteMany({
     where: {
       tenancyId: auth.tenancy.id,
       id: params.id,
     },
   });
+
+  await recordExternalDbSyncDeletion(globalPrismaClient, {
+    tableName: "ProjectUserRefreshToken",
+    tenancyId: auth.tenancy.id,
+    refreshTokenId: params.id,
+  });
 },
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/app/api/latest/auth/sessions/crud.tsx` around lines 75 - 86, The deletion is recorded via recordExternalDbSyncDeletion before the actual delete, risking inconsistent syncs on concurrent operations; change the order so you first perform globalPrismaClient.projectUserRefreshToken.deleteMany (capture its result, e.g., const result = await globalPrismaClient.projectUserRefreshToken.deleteMany({...})) and only if result.count > 0 call recordExternalDbSyncDeletion with the same payload (tableName "ProjectUserRefreshToken", tenancyId auth.tenancy.id, refreshTokenId params.id) so the external sync is recorded only when the DB deletion actually succeeded.
apps/backend/src/oauth/model.tsx (1)
182-191: ⚠️ Potential issue | 🟡 Minor
Inconsistent `withExternalDbSyncUpdate` usage: `create` branch not wrapped.
The `update` payload is wrapped with `withExternalDbSyncUpdate`, but the `create` payload is not. The `notification-preference/crud.tsx` upsert wraps both branches with `withExternalDbSyncUpdate`, creating an inconsistency. Clarify whether new refresh token records should also trigger external DB sync marking. If yes, wrap the `create` branch for consistency with other upsert patterns in the codebase.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/oauth/model.tsx` around lines 182 - 191, The upsert call currently applies withExternalDbSyncUpdate to the update branch but not to the create branch, causing inconsistency with other upserts (e.g., notification-preference/crud.tsx); to fix, wrap the create payload in withExternalDbSyncUpdate as well so both branches mark external DB sync (i.e., change the create object passed to the upsert to be withExternalDbSyncUpdate({ refreshToken: token.refreshToken, tenancyId: tenancy.id, projectUserId: user.id }) while keeping the update branch unchanged), locating this change around the upsert call that references withExternalDbSyncUpdate in the same module.
apps/backend/src/lib/permissions.tsx (1)
433-439: ⚠️ Potential issue | 🟠 Major
Missing external DB sync deletion recording for team permissions.
When deleting a permission definition with team scope, `deleteMany` is called directly without recording deletions via `recordExternalDbSyncDeletion`. This is inconsistent with the project scope handling (lines 441-460), which properly records each deletion before calling `deleteMany`. Team permission deletions won't be synced to external databases with the current implementation.
🐛 Proposed fix to add deletion recording for team scope

```diff
   // Remove all direct permissions for this permission ID
   if (options.scope === "team") {
+    const teamPermissions = await sourceOfTruthTx.teamMemberDirectPermission.findMany({
+      where: {
+        tenancyId: options.tenancy.id,
+        permissionId: options.permissionId,
+      },
+      select: { id: true },
+    });
+    for (const perm of teamPermissions) {
+      await recordExternalDbSyncDeletion(sourceOfTruthTx, {
+        tableName: "TeamMemberDirectPermission",
+        tenancyId: options.tenancy.id,
+        permissionDbId: perm.id,
+      });
+    }
     await sourceOfTruthTx.teamMemberDirectPermission.deleteMany({
       where: {
         tenancyId: options.tenancy.id,
         permissionId: options.permissionId,
       },
     });
   } else {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/lib/permissions.tsx` around lines 433 - 439, The team-scoped branch currently calls sourceOfTruthTx.teamMemberDirectPermission.deleteMany without recording deletions; mirror the project-scoped flow by first querying the matching teamMemberDirectPermission rows (by tenancyId and permissionId) and for each row call recordExternalDbSyncDeletion with the row id and appropriate metadata, then perform the deleteMany; update the options.scope === "team" block to use sourceOfTruthTx.teamMemberDirectPermission.findMany (or equivalent) to collect ids, call recordExternalDbSyncDeletion for each, and only afterwards call sourceOfTruthTx.teamMemberDirectPermission.deleteMany to remove them.
🧹 Nitpick comments (5)
apps/backend/scripts/run-cron-jobs.ts (1)
29-33: Consider adding backoff on error to avoid rapid retries during outages.
The fixed 1-second interval applies regardless of whether the previous call succeeded or failed. If an endpoint experiences persistent failures (e.g., database down, network issues), this will cause rapid-fire retries every second, potentially overwhelming the system and flooding logs.
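One way to express that as a testable policy is to separate the delay calculation from the loop itself: delay grows with consecutive failures and resets on success. The constants below are illustrative, not values from the repo.

```typescript
// Sketch of a capped exponential backoff policy for the cron loop.
const BASE_DELAY_MS = 1_000;
const MAX_DELAY_MS = 60_000;

function nextDelayMs(consecutiveFailures: number): number {
  if (consecutiveFailures <= 0) return BASE_DELAY_MS; // healthy cadence: 1s
  return Math.min(BASE_DELAY_MS * 2 ** consecutiveFailures, MAX_DELAY_MS);
}

// In the loop, roughly: increment the failure counter when runResult.status
// is "error", reset it to 0 on success, then `await wait(nextDelayMs(n))`.
```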
Consider increasing the wait time after errors:
♻️ Proposed fix with error backoff

```diff
   const runResult = await Result.fromPromise(run(endpoint));
   if (runResult.status === "error") {
     captureError("run-cron-jobs", runResult.error);
+    await wait(10_000); // Wait longer on error to avoid rapid retries
+  } else {
+    await wait(1000);
   }
-  await wait(1000);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/scripts/run-cron-jobs.ts` around lines 29 - 33, The loop always pauses a fixed 1s causing rapid retries on failures; modify the logic around runResult (from Result.fromPromise(run(endpoint))) so that when runResult.status === "error" you apply a backoff before the next iteration (e.g., exponential or capped linear backoff) and call captureError("run-cron-jobs", runResult.error) as today, but increase wait() based on a retry counter or timestamp; reset the backoff counter when run(endpoint) succeeds so successful runs continue at the normal interval, and enforce a configurable max backoff to avoid unbounded delays.
apps/backend/src/app/api/latest/auth/sessions/current/route.tsx (1)
35-36: Unused `prisma` variable.
The `prisma` client is obtained but never used. Either remove this line or use `prisma` instead of `globalPrismaClient` for the delete operation if a tenancy-scoped client is intended.
♻️ Proposed fix

```diff
   try {
-    const prisma = await getPrismaClientForTenancy(tenancy);
-
     await recordExternalDbSyncDeletion(globalPrismaClient, {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/app/api/latest/auth/sessions/current/route.tsx` around lines 35 - 36, The prisma client returned by getPrismaClientForTenancy(tenancy) is created but unused; either remove that call or switch the delete call to use the tenancy-scoped client. Update the code so the session deletion uses prisma (the variable returned from getPrismaClientForTenancy) instead of globalPrismaClient, or remove the prisma assignment if tenancy-scoped behavior is not needed; ensure references to globalPrismaClient are replaced with prisma in the current route handler (where the delete operation occurs) if you choose the tenancy-scoped fix.
apps/backend/src/app/api/latest/session-replays/batch/route.tsx (1)
202-205: Redundant `shouldUpdateSequenceId` update after chunk creation.
The `prisma.sessionReplay.upsert` at lines 114-130 already sets `shouldUpdateSequenceId: true` in both the `create` and `update` branches. Since chunk creation doesn't modify the `SessionReplay` row's synced fields, this additional update is unnecessary and adds an extra database round-trip on every non-deduped upload.
🔧 Suggested removal

```diff
-  await prisma.sessionReplay.update({
-    where: { tenancyId_id: { tenancyId, id: replayId } },
-    data: { shouldUpdateSequenceId: true },
-  });
-
   return {
     statusCode: 200,
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/app/api/latest/session-replays/batch/route.tsx` around lines 202 - 205, The explicit prisma.sessionReplay.update call that sets shouldUpdateSequenceId: true after chunk creation is redundant because the earlier prisma.sessionReplay.upsert (in the same flow) already sets shouldUpdateSequenceId: true in both create and update branches; remove the extra prisma.sessionReplay.update that uses where: { tenancyId_id: { tenancyId, id: replayId } } and data: { shouldUpdateSequenceId: true } to avoid the unnecessary additional DB round-trip after chunk creation.
apps/backend/src/app/api/latest/internal/external-db-sync/sequencer/route.ts (1)
189-214: Minor: CTE selects only `id` but orders by `tenancyId`.
The `rows_to_update` CTE orders by `tenancyId` but only selects `id`. This works correctly for row locking, but including `tenancyId` in the SELECT would be more explicit about the ordering's purpose and match other queries in this file.
♻️ Suggested change for consistency

```diff
 WITH rows_to_update AS (
-  SELECT "id"
+  SELECT "id", "tenancyId"
   FROM "TeamMemberDirectPermission"
   WHERE "shouldUpdateSequenceId" = TRUE
   ORDER BY "tenancyId"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/app/api/latest/internal/external-db-sync/sequencer/route.ts` around lines 189 - 214, The CTE rows_to_update currently SELECTs only "id" but orders by "tenancyId"; update the rows_to_update CTE in the globalPrismaClient.$queryRaw call so it also SELECTs "tenancyId" (e.g., SELECT "id", "tenancyId") to make the ORDER BY explicit and consistent with other queries; ensure the rest of the query (the updated_rows UPDATE and the final SELECT DISTINCT "tenancyId") continues to work unchanged and that the returned teamPermissionTenants variable and subsequent enqueueExternalDbSyncBatch call still map t => t.tenancyId correctly.
apps/backend/src/app/api/latest/internal/external-db-sync/status/route.ts (1)
252-300: Consider documenting the mapping coverage.
`buildMappingInternalStats` currently only computes internal stats for the `users` and `email_outboxes` mappings. The other sequenced tables (teams, team_members, etc.) have sequencer stats exposed but no corresponding mapping entries. If this is intentional (e.g., these tables are synced through different mechanisms or planned for future mapping expansion), a brief comment would help clarify.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/app/api/latest/internal/external-db-sync/status/route.ts` around lines 252 - 300, In buildMappingInternalStats, add a brief explanatory comment above the mappingInternalStats construction stating why only "users" and "email_outboxes" are included (e.g., other sequenced tables like teams/team_members are intentionally not mapped here because they are synced differently or planned for future mapping), or alternatively add the missing mapping entries if omission was accidental; reference the function buildMappingInternalStats, the mappingInternalStats map, and deletedRowsByTable/SequenceStats to locate where to place the comment or add mappings so reviewers understand intended coverage.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/backend/scripts/verify-data-integrity/clickhouse-sync-verifier.ts`:
- Around line 12-24: SORT_KEYS currently uses non-unique keys for some mappings
causing row misalignment; update the SORT_KEYS object so keys match the actual
unique columns returned by the sync queries: set "team_permissions" and
"project_permissions" to ["id"], set "notification_preferences" to a composite
unique key (e.g. ["user_id","notification_type"]) and set "connected_accounts"
to a composite unique key (e.g. ["provider","provider_account_id","user_id"]);
modify the SORT_KEYS constant accordingly so the verifier sorts and compares
rows using the true unique identifier columns for each mapping.
In `@apps/backend/src/lib/external-db-sync.ts`:
- Around line 1141-1147: The schema mapping for email_outboxes mistakenly
normalizes rendered_is_transactional instead of the ClickHouse-aliased
is_transactional; update the email_outboxes mapping in external-db-sync.ts to
use is_transactional (and keep the nullable_boolean normalizer) so the
nullable-boolean normalizer runs and Postgres true/false are compared correctly
against ClickHouse 1/0 (check references to rendered_is_transactional,
is_transactional, and the nullable_boolean normalizer to locate and update the
mapping).
In `@apps/e2e/tests/backend/endpoints/api/v1/external-db-sync-utils.ts`:
- Around line 165-170: The catch block treating Postgres error '42P01'
(undefined table) as failure should instead treat it as success when the caller
expects the table to be absent; update the handler around client.query so that
when err.code === '42P01' you return true if shouldExist === false (and still
return/propagate false or rethrow for other cases), referencing the client.query
call and the shouldExist flag to locate and fix the logic.
In `@packages/stack-shared/src/config/db-sync-mappings.ts`:
- Around line 931-935: The deleted-rows SELECT is aliasing the permission column
as "id" which breaks row-shape consistency expected by pushRowsToExternalDb;
change the alias ("DeletedRow"."primaryKey"->>'permissionId')::text AS "id" to
AS "permission_id" (and update the other similar occurrences referenced around
lines 1614-1617) so the deleted row column names match the live branch (e.g.,
keep "team_id", "user_id", "permission_id", "created_at") and ensure any
consumers (pushRowsToExternalDb) receive uniform column names/order.
- Around line 347-360: The tombstone rows for deleted contact_channels/teams use
NULL::text for non-null ClickHouse String columns (see the SELECT emitting NULL
for "type" and "value" and similar NULL for "display_name"), causing inserts to
fail; update the mappings in db-sync-mappings.ts (the delete-row SELECT that
sets "sync_is_deleted" true and fields "type","value", and "display_name") to
emit a non-null tombstone string (e.g., empty string or a constant like
'DELETED') instead of NULL::text for those fields, and apply the same change to
the other delete-mapping block referenced (around the other occurrence at lines
543-555) so ClickHouse non-null String columns receive valid values.
---
Outside diff comments:
In `@apps/backend/src/app/api/latest/auth/sessions/crud.tsx`:
- Around line 75-86: The deletion is recorded via recordExternalDbSyncDeletion
before the actual delete, risking inconsistent syncs on concurrent operations;
change the order so you first perform
globalPrismaClient.projectUserRefreshToken.deleteMany (capture its result, e.g.,
const result = await
globalPrismaClient.projectUserRefreshToken.deleteMany({...})) and only if
result.count > 0 call recordExternalDbSyncDeletion with the same payload
(tableName "ProjectUserRefreshToken", tenancyId auth.tenancy.id, refreshTokenId
params.id) so the external sync is recorded only when the DB deletion actually
succeeded.
In `@apps/backend/src/app/api/latest/auth/sessions/current/route.tsx`:
- Around line 34-48: The current flow calls recordExternalDbSyncDeletion before
the actual delete (globalPrismaClient.projectUserRefreshToken.deleteMany),
risking a stale deletion record if the delete fails; change the logic so the
delete is performed first and only on success record the external deletion, or
wrap both actions in a single transaction to ensure atomicity — locate the code
around getPrismaClientForTenancy, recordExternalDbSyncDeletion, and
globalPrismaClient.projectUserRefreshToken.deleteMany and either (A) perform
deleteMany, check result.count>0, then call recordExternalDbSyncDeletion, or (B)
execute both recordExternalDbSyncDeletion and deleteMany inside a transactional
block (using globalPrismaClient.$transaction or equivalent) so both succeed or
both roll back.
In `@apps/backend/src/lib/permissions.tsx`:
- Around line 433-439: The team-scoped branch currently calls
sourceOfTruthTx.teamMemberDirectPermission.deleteMany without recording
deletions; mirror the project-scoped flow by first querying the matching
teamMemberDirectPermission rows (by tenancyId and permissionId) and for each row
call recordExternalDbSyncDeletion with the row id and appropriate metadata, then
perform the deleteMany; update the options.scope === "team" block to use
sourceOfTruthTx.teamMemberDirectPermission.findMany (or equivalent) to collect
ids, call recordExternalDbSyncDeletion for each, and only afterwards call
sourceOfTruthTx.teamMemberDirectPermission.deleteMany to remove them.
In `@apps/backend/src/oauth/model.tsx`:
- Around line 182-191: The upsert call currently applies
withExternalDbSyncUpdate to the update branch but not to the create branch,
causing inconsistency with other upserts (e.g.,
notification-preference/crud.tsx); to fix, wrap the create payload in
withExternalDbSyncUpdate as well so both branches mark external DB sync (i.e.,
change the create object passed to the upsert to be withExternalDbSyncUpdate({
refreshToken: token.refreshToken, tenancyId: tenancy.id, projectUserId: user.id
}) while keeping the update branch unchanged), locating this change around the
upsert call that references withExternalDbSyncUpdate in the same module.
---
Nitpick comments:
In `@apps/backend/scripts/run-cron-jobs.ts`:
- Around line 29-33: The loop always pauses a fixed 1s causing rapid retries on
failures; modify the logic around runResult (from
Result.fromPromise(run(endpoint))) so that when runResult.status === "error" you
apply a backoff before the next iteration (e.g., exponential or capped linear
backoff) and call captureError("run-cron-jobs", runResult.error) as today, but
increase wait() based on a retry counter or timestamp; reset the backoff counter
when run(endpoint) succeeds so successful runs continue at the normal interval,
and enforce a configurable max backoff to avoid unbounded delays.
In `@apps/backend/src/app/api/latest/auth/sessions/current/route.tsx`:
- Around line 35-36: The prisma client returned by
getPrismaClientForTenancy(tenancy) is created but unused; either remove that
call or switch the delete call to use the tenancy-scoped client. Update the code
so the session deletion uses prisma (the variable returned from
getPrismaClientForTenancy) instead of globalPrismaClient, or remove the prisma
assignment if tenancy-scoped behavior is not needed; ensure references to
globalPrismaClient are replaced with prisma in the current route handler (where
the delete operation occurs) if you choose the tenancy-scoped fix.
In
`@apps/backend/src/app/api/latest/internal/external-db-sync/sequencer/route.ts`:
- Around line 189-214: The CTE rows_to_update currently SELECTs only "id" but
orders by "tenancyId"; update the rows_to_update CTE in the
globalPrismaClient.$queryRaw call so it also SELECTs "tenancyId" (e.g., SELECT
"id", "tenancyId") to make the ORDER BY explicit and consistent with other
queries; ensure the rest of the query (the updated_rows UPDATE and the final
SELECT DISTINCT "tenancyId") continues to work unchanged and that the returned
teamPermissionTenants variable and subsequent enqueueExternalDbSyncBatch call
still map t => t.tenancyId correctly.
In `@apps/backend/src/app/api/latest/internal/external-db-sync/status/route.ts`:
- Around line 252-300: In buildMappingInternalStats, add a brief explanatory
comment above the mappingInternalStats construction stating why only "users" and
"email_outboxes" are included (e.g., other sequenced tables like
teams/team_members are intentionally not mapped here because they are synced
differently or planned for future mapping), or alternatively add the missing
mapping entries if omission was accidental; reference the function
buildMappingInternalStats, the mappingInternalStats map, and
deletedRowsByTable/SequenceStats to locate where to place the comment or add
mappings so reviewers understand intended coverage.
In `@apps/backend/src/app/api/latest/session-replays/batch/route.tsx`:
- Around line 202-205: The explicit prisma.sessionReplay.update call that sets
shouldUpdateSequenceId: true after chunk creation is redundant because the
earlier prisma.sessionReplay.upsert (in the same flow) already sets
shouldUpdateSequenceId: true in both create and update branches; remove the
extra prisma.sessionReplay.update that uses where: { tenancyId_id: { tenancyId,
id: replayId } } and data: { shouldUpdateSequenceId: true } to avoid the
unnecessary additional DB round-trip after chunk creation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8967ebb2-a916-4506-a2ac-58df8d42c52a
📒 Files selected for processing (38)
- .gitignore
- apps/backend/prisma/migrations/20260316000000_add_team_team_member_sequence_columns/migration.sql
- apps/backend/prisma/migrations/20260316000001_add_email_outbox_sequence_columns/migration.sql
- apps/backend/prisma/migrations/20260316000002_add_session_replay_sequence_columns/migration.sql
- apps/backend/prisma/migrations/20260317000000_add_team_permission_invitation_sequence_columns/migration.sql
- apps/backend/prisma/migrations/20260317000001_add_project_permission_notification_preference_sequence_columns/migration.sql
- apps/backend/prisma/migrations/20260318000000_add_sequence_id_to_refresh_tokens_and_oauth_accounts/migration.sql
- apps/backend/prisma/migrations/20260318000001_add_sequence_indexes_concurrently/migration.sql
- apps/backend/prisma/schema.prisma
- apps/backend/scripts/clickhouse-migrations.ts
- apps/backend/scripts/run-cron-jobs.ts
- apps/backend/scripts/verify-data-integrity/clickhouse-sync-verifier.ts
- apps/backend/scripts/verify-data-integrity/index.ts
- apps/backend/src/app/api/latest/auth/password/update/route.tsx
- apps/backend/src/app/api/latest/auth/sessions/crud.tsx
- apps/backend/src/app/api/latest/auth/sessions/current/route.tsx
- apps/backend/src/app/api/latest/emails/notification-preference/crud.tsx
- apps/backend/src/app/api/latest/emails/outbox/crud.tsx
- apps/backend/src/app/api/latest/emails/unsubscribe-link/route.tsx
- apps/backend/src/app/api/latest/internal/external-db-sync/sequencer/route.ts
- apps/backend/src/app/api/latest/internal/external-db-sync/status/route.ts
- apps/backend/src/app/api/latest/oauth-providers/crud.tsx
- apps/backend/src/app/api/latest/session-replays/batch/route.tsx
- apps/backend/src/app/api/latest/team-member-profiles/crud.tsx
- apps/backend/src/app/api/latest/team-memberships/crud.tsx
- apps/backend/src/app/api/latest/teams/crud.tsx
- apps/backend/src/app/api/latest/users/crud.tsx
- apps/backend/src/lib/email-queue-step.tsx
- apps/backend/src/lib/external-db-sync.ts
- apps/backend/src/lib/permissions.tsx
- apps/backend/src/lib/tokens.tsx
- apps/backend/src/oauth/model.tsx
- apps/backend/src/route-handlers/verification-code-handler.tsx
- apps/dashboard/src/app/(main)/(protected)/projects/[projectId]/external-db-sync/page-client.tsx
- apps/e2e/tests/backend/endpoints/api/v1/analytics-query.test.ts
- apps/e2e/tests/backend/endpoints/api/v1/external-db-sync-basics.test.ts
- apps/e2e/tests/backend/endpoints/api/v1/external-db-sync-utils.ts
- packages/stack-shared/src/config/db-sync-mappings.ts
Actionable comments posted: 3
🧹 Nitpick comments (3)
apps/backend/src/lib/external-db-sync.ts (2)
1187-1189: Assert the derived target table instead of using `!`.

The invariant here is "parsed table name is a non-empty string", not just "not undefined". Reusing `assertNonEmptyString` makes that assumption explicit and keeps this path aligned with the rest of the file's validation style.

🛠️ Minimal fix

```diff
- const targetTable = tableName.includes('.') ? tableName.split('.').pop()! : tableName;
+ const targetTable = tableName.includes(".") ? tableName.split(".").pop() : tableName;
+ assertNonEmptyString(targetTable, `ClickHouse target table derived from ${JSON.stringify(tableName)}`);
  const normalizers = CLICKHOUSE_COLUMN_NORMALIZERS[targetTable] ?? {};
```

As per coding guidelines "Validate all assumptions through type system (preferred), assertions, or tests; ideally two out of three".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/lib/external-db-sync.ts` around lines 1187 - 1189, The code derives targetTable with tableName.split('.').pop()! which uses a non-null assertion; replace that with an explicit assertion by calling assertNonEmptyString on the derived value to ensure the invariant "parsed table name is a non-empty string"; locate the snippet that creates targetTable and change it to compute the value then call assertNonEmptyString(targetTable, 'targetTable') (or equivalent) before using it with CLICKHOUSE_COLUMN_NORMALIZERS to match the file's validation style.
1107-1167: Make the normalizer table keyed by the real target-table union.

The write path looks up `CLICKHOUSE_COLUMN_NORMALIZERS` with `mapping.targetTable`, while the verifier currently looks it up with the mapping key. Keeping this as `Record<string, ...>` hides that coupling and lets misspelled table names silently opt out of normalization. Typing it as `Partial<Record<MappingTargetTable, ...>>` would force both call sites onto the same key shape, including apps/backend/scripts/verify-data-integrity/clickhouse-sync-verifier.ts:160.

As per coding guidelines "Validate all assumptions through type system (preferred), assertions, or tests; ideally two out of three".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/lib/external-db-sync.ts` around lines 1107 - 1167, The CLICKHOUSE_COLUMN_NORMALIZERS map is typed too loosely as Record<string,...>, so misspelled or mismatched mapping keys can bypass normalization; change its type to Partial<Record<MappingTargetTable, Record<string,'json'|'boolean'|'nullable_boolean'|'bigint'>>> (use the MappingTargetTable union) and update any callers to use the canonical mapping.targetTable key when looking up normalizers (not the mapping object key) so both the write path and the verifier use the same key shape; run the TypeScript build to fix any call-site errors and adjust code to satisfy the stricter type (e.g., change verifier lookup to CLICKHOUSE_COLUMN_NORMALIZERS[mapping.targetTable]).

apps/backend/scripts/verify-data-integrity/clickhouse-sync-verifier.ts (1)

152-153: Missing fetch queries shouldn't silently skip a mapping.

A missing `internalDbFetchQueries.clickhouse` currently turns into a quiet `return`, which makes the verifier look green even though that mapping was never checked.

🛠️ Fail loudly instead

```diff
  const fetchQuery = mapping.internalDbFetchQueries.clickhouse;
- if (!fetchQuery) return;
+ if (fetchQuery == null) {
+   throw new StackAssertionError(`Missing ClickHouse fetch query for mapping ${mappingName}.`);
+ }
```

As per coding guidelines "Fail early, fail loud; fail fast with an error instead of silently continuing" and "Prefer explicit null/undefinedness checks over boolean checks (e.g., `foo == null` instead of `!foo`)".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/scripts/verify-data-integrity/clickhouse-sync-verifier.ts` around lines 152 - 153, The code silently skips mappings when mapping.internalDbFetchQueries.clickhouse is missing; change the check to an explicit null/undefined check (use fetchQuery == null) and throw a descriptive Error instead of returning so the verifier fails loudly (include mapping identifier in the message). Locate the fetchQuery assignment and the conditional around mapping.internalDbFetchQueries.clickhouse, replace the early return with throwing an Error that names the mapping (e.g., mapping.name or mapping.id) and mentions the missing internalDbFetchQueries.clickhouse entry.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/backend/scripts/verify-data-integrity/clickhouse-sync-verifier.ts`:
- Around line 40-42: The code currently stringifies JSON columns in the
columnType === "json" branch which prevents stripNullsAndEmpties() and
deepEqual() from normalizing structural differences; instead, return a
structured JS value so the scrub can operate: if the incoming value is a string
try JSON.parse and return the resulting object, if it's already an object return
it unchanged, and only fallback to stringifying on parse error; ensure the same
logic is applied wherever columnType === "json" is handled (the other occurrence
noted) so stripNullsAndEmpties() and deepEqual() run on actual objects rather
than flattened strings.
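The described "json" branch can be sketched as follows; `normalizeJsonColumn` is a hypothetical name, not the verifier's actual helper:

```typescript
// Hedged sketch of the suggested "json" branch: return structured values so
// stripNullsAndEmpties()/deepEqual() can operate on objects, not strings.
function normalizeJsonColumn(value: unknown): unknown {
  if (typeof value === "string") {
    try {
      // Return the parsed object so structural normalization can run on it.
      return JSON.parse(value);
    } catch {
      // Not valid JSON: fall back to the raw string.
      return value;
    }
  }
  // Already an object (or null): pass through unchanged.
  return value;
}
```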
- Around line 28-35: The compareRows function currently collapses missing sort
keys to "" which masks schema drift; instead, in compareRows(validate that for
each key in sortKeys both a and b actually have the key (use
Object.prototype.hasOwnProperty.call or key in ...) and if a key is missing on
either row throw a descriptive Error naming the missing key and which row (a or
b) so the verifier fails fast; after validating presence, continue with the
string comparison logic (convert values to strings and compare) as before.
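The fail-fast comparison described above could look like this; the `Row` type and the string-compare tiebreak are assumptions about the verifier's shapes, not its actual code:

```typescript
// Hedged sketch of compareRows with fail-fast sort-key validation.
type Row = Record<string, unknown>;

function compareRows(a: Row, b: Row, sortKeys: string[]): number {
  for (const key of sortKeys) {
    // Fail fast if either row lacks a sort key instead of defaulting to "".
    if (!Object.prototype.hasOwnProperty.call(a, key)) {
      throw new Error(`Row a is missing sort key "${key}"`);
    }
    if (!Object.prototype.hasOwnProperty.call(b, key)) {
      throw new Error(`Row b is missing sort key "${key}"`);
    }
    const av = String(a[key]);
    const bv = String(b[key]);
    if (av !== bv) return av < bv ? -1 : 1;
  }
  return 0;
}
```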
- Around line 62-75: The verifier's normalizeClickhouseValue currently only
handles JSON and dates; update it to also handle the types declared in
CLICKHOUSE_COLUMN_NORMALIZERS by mirroring the write-path normalization in
external-db-sync.ts (lines ~1234-1245): for "boolean" convert ClickHouse
boolean-like values ("0"/"1", "false"/"true", 0/1) to true/false; for
"nullable_boolean" treat "null" or null/undefined as null and otherwise convert
boolean-like values to true/false; for "bigint" normalize numeric or string
integer values to the same numeric representation used by the write path (e.g.,
Number(value) or BigInt if the write path uses BigInt). Implement these branches
inside normalizeClickhouseValue (use the columnType parameter) so
ClickHouse-side normalization matches the Postgres-side normalization logic.
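The extra branches can be sketched as below; the `ColumnType` union and the choice of `Number` (rather than `BigInt`) for `"bigint"` are assumptions about the write path, not confirmed from the source:

```typescript
// Hedged sketch of the additional normalizer branches for the verifier.
type ColumnType = "boolean" | "nullable_boolean" | "bigint";

// ClickHouse boolean-like values: 0/1, "0"/"1", "false"/"true".
function toBooleanLike(value: unknown): boolean {
  return value === true || value === 1 || value === "1" || value === "true";
}

function normalizeClickhouseValue(value: unknown, columnType: ColumnType): unknown {
  switch (columnType) {
    case "boolean":
      // Collapse ClickHouse boolean-like values to a real boolean.
      return toBooleanLike(value);
    case "nullable_boolean":
      if (value == null || value === "null") return null;
      return toBooleanLike(value);
    case "bigint":
      // Normalize numeric strings and numbers to one representation;
      // swap for BigInt if that is what the write path uses.
      return Number(value);
  }
}
```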
---
Nitpick comments:
In `@apps/backend/scripts/verify-data-integrity/clickhouse-sync-verifier.ts`:
- Around line 152-153: The code silently skips mappings when
mapping.internalDbFetchQueries.clickhouse is missing; change the check to an
explicit null/undefined check (use fetchQuery == null) and throw a descriptive
Error instead of returning so the verifier fails loudly (include mapping
identifier in the message). Locate the fetchQuery assignment and the conditional
around mapping.internalDbFetchQueries.clickhouse, replace the early return with
throwing an Error that names the mapping (e.g., mapping.name or mapping.id) and
mentions the missing internalDbFetchQueries.clickhouse entry.
In `@apps/backend/src/lib/external-db-sync.ts`:
- Around line 1187-1189: The code derives targetTable with
tableName.split('.').pop()! which uses a non-null assertion; replace that with
an explicit assertion by calling assertNonEmptyString on the derived value to
ensure the invariant "parsed table name is a non-empty string"; locate the
snippet that creates targetTable and change it to compute the value then call
assertNonEmptyString(targetTable, 'targetTable') (or equivalent) before using it
with CLICKHOUSE_COLUMN_NORMALIZERS to match the file's validation style.
- Around line 1107-1167: The CLICKHOUSE_COLUMN_NORMALIZERS map is typed too
loosely as Record<string,...>, so misspelled or mismatched mapping keys can
bypass normalization; change its type to Partial<Record<MappingTargetTable,
Record<string,'json'|'boolean'|'nullable_boolean'|'bigint'>>> (use the
MappingTargetTable union) and update any callers to use the canonical
mapping.targetTable key when looking up normalizers (not the mapping object key)
so both the write path and the verifier use the same key shape; run the
TypeScript build to fix any call-site errors and adjust code to satisfy the
stricter type (e.g., change verifier lookup to
CLICKHOUSE_COLUMN_NORMALIZERS[mapping.targetTable]).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 92420d89-684a-486b-835b-e8090d52b59c
📒 Files selected for processing (2)
- apps/backend/scripts/verify-data-integrity/clickhouse-sync-verifier.ts
- apps/backend/src/lib/external-db-sync.ts
- Narrow team invitation cascade to specific changed teamId instead of entire tenancy
- Populate status dashboard stats for all 11 sync mappings, not just users and email_outboxes
- Wrap all OAuth provider updates with withExternalDbSyncUpdate so allowSignIn and allowConnectedAccounts changes trigger sync
- Remove session_replays from sequencer and column type map since no mapping exists yet
- Add DeletedRow union to email_outboxes fetch queries for future safety
Actionable comments posted: 5
♻️ Duplicate comments (3)
packages/stack-shared/src/config/db-sync-mappings.ts (3)
1708-1711: ⚠️ Potential issue | 🔴 Critical

Column alias mismatch between live and deleted branches.

Same issue as `team_permissions`: the deleted branch aliases `permissionId` as `"id"` (line 1710), but the live branch at line 1698 uses `"permission_id"`.

🛠️ Suggested fix

```diff
  SELECT ("DeletedRow"."primaryKey"->>'projectUserId')::uuid AS "user_id",
- "DeletedRow"."primaryKey"->>'permissionId' AS "id",
+ "DeletedRow"."primaryKey"->>'permissionId' AS "permission_id",
  "DeletedRow"."deletedAt"::timestamp without time zone AS "created_at",
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/stack-shared/src/config/db-sync-mappings.ts` around lines 1708 - 1711, The deleted-branch SELECT uses ("DeletedRow"."primaryKey"->>'permissionId') AS "id" which mismatches the live branch that aliases this column as "permission_id"; update the deleted-branch alias to "permission_id" so both branches use the same column name (adjust the expression in the DeletedRow primaryKey extraction to alias permissionId as "permission_id" instead of "id").
931-935: ⚠️ Potential issue | 🔴 Critical

Column alias mismatch between live and deleted branches.

The deleted branch aliases `permissionId` as `"id"` (line 934), but the live branch at line 921 aliases it as `"permission_id"`. This inconsistency will cause row shape mismatches when processing mixed live+deleted batches.

🛠️ Suggested fix

```diff
  SELECT ("DeletedRow"."primaryKey"->>'teamId')::uuid AS "team_id",
  ("DeletedRow"."primaryKey"->>'projectUserId')::uuid AS "user_id",
- "DeletedRow"."primaryKey"->>'permissionId' AS "id",
+ "DeletedRow"."primaryKey"->>'permissionId' AS "permission_id",
  "DeletedRow"."deletedAt"::timestamp without time zone AS "created_at",
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/stack-shared/src/config/db-sync-mappings.ts` around lines 931 - 935, The deleted-branch SELECT is aliasing ("DeletedRow"."primaryKey"->>'permissionId') as "id" which mismatches the live branch's "permission_id" shape; update the deleted branch to alias permissionId as "permission_id" (i.e., change the alias for ("DeletedRow"."primaryKey"->>'permissionId') to "permission_id") so the DeletedRow output shape matches the live branch's output and mixed batches process consistently.
2178-2187: ⚠️ Potential issue | 🟠 Major

Potential NULL values for non-nullable ClickHouse columns in tombstone rows.

The deleted branch reads `provider` and `provider_account_id` from `"DeletedRow"."data"` (lines 2182-2183). If the `data` JSON doesn't contain these fields, the values will be NULL, but the ClickHouse schema defines them as non-nullable `String` columns (lines 2136-2137).

Compare with `team_invitations` (lines 1064-1066), which correctly uses empty string/nil UUID literals for tombstone values.

🛠️ Suggested fix using COALESCE for safe fallbacks

```diff
  SELECT "Tenancy"."projectId" AS "project_id",
  "Tenancy"."branchId" AS "branch_id",
  ("DeletedRow"."data"->>'projectUserId')::uuid AS "user_id",
- "DeletedRow"."data"->>'configOAuthProviderId' AS "provider",
- "DeletedRow"."data"->>'providerAccountId' AS "provider_account_id",
+ COALESCE("DeletedRow"."data"->>'configOAuthProviderId', '') AS "provider",
+ COALESCE("DeletedRow"."data"->>'providerAccountId', '') AS "provider_account_id",
  "DeletedRow"."deletedAt"::timestamp without time zone AS "created_at",
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@packages/stack-shared/src/config/db-sync-mappings.ts` around lines 2178 - 2187, The tombstone SELECT reads nullable JSON fields into non-nullable ClickHouse columns; update the expressions to provide safe fallbacks using COALESCE so ClickHouse doesn't get NULLs: replace ("DeletedRow"."data"->>'projectUserId')::uuid with a COALESCE fallback to the nil UUID literal, and wrap "DeletedRow"."data"->>'configOAuthProviderId' and "DeletedRow"."data"->>'providerAccountId' with COALESCE(..., '') so "provider" and "provider_account_id" emit empty strings instead of NULLs; adjust these three expressions in the SELECT targeting the DeletedRow JSON fields.
🧹 Nitpick comments (3)
apps/backend/src/app/api/latest/internal/external-db-sync/sequencer/route.ts (3)
190-208: Minor inconsistency: ORDER BY column not in SELECT list.

The `rows_to_update` CTE selects only `"id"` but `ORDER BY` references `"tenancyId"`. While this works in PostgreSQL, it's inconsistent with other backfill queries in this file (e.g., lines 57-61) where the `ORDER BY` column is included in the SELECT list.

♻️ Optional fix for consistency

```diff
  WITH rows_to_update AS (
-   SELECT "id"
+   SELECT "tenancyId", "id"
    FROM "TeamMemberDirectPermission"
    WHERE "shouldUpdateSequenceId" = TRUE
    ORDER BY "tenancyId"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/app/api/latest/internal/external-db-sync/sequencer/route.ts` around lines 190 - 208, The rows_to_update CTE used in the query assigned to teamPermissionTenants references "tenancyId" in ORDER BY but only selects "id", causing an inconsistency; update the SQL (the query inside route.ts that builds teamPermissionTenants) so the rows_to_update CTE also selects "tenancyId" (e.g., SELECT "id", "tenancyId") to match the ORDER BY and mirror the pattern used elsewhere in this file.
278-296: Same inconsistency: ORDER BY column not in SELECT list.

Same pattern as `TeamMemberDirectPermission`; consider including `"tenancyId"` in the SELECT for consistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/app/api/latest/internal/external-db-sync/sequencer/route.ts` around lines 278 - 296, The CTE rows_to_update in the globalPrismaClient.$queryRaw query for ProjectUserDirectPermission orders by "tenancyId" but doesn't select it, which can be inconsistent; modify the rows_to_update CTE used in the query (the SELECT inside rows_to_update) to include "tenancyId" alongside "id" so ORDER BY "tenancyId" is valid and consistent with the TeamMemberDirectPermission pattern, keeping the rest of the UPDATE and SELECT DISTINCT "tenancyId" logic unchanged.
418: Consider adding a legend or using clearer abbreviations.

The log line uses terse abbreviations (e.g., `TM`, `TMB`, `TP`) that may be unclear to operators not familiar with the codebase. For debugging/operational clarity, consider either using slightly more descriptive abbreviations or adding a comment above with the legend.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/backend/src/app/api/latest/internal/external-db-sync/sequencer/route.ts` at line 418, The console.log line in route.ts that prints backfilled counts uses terse abbreviations; update the log to use clearer labels or add a one-line legend comment above it so operators can map abbreviations to meanings. Locate the console.log that references projectUserTenants, contactChannelTenants, teamTenants, teamMemberTenants, teamPermissionTenants, teamInvitationTenants, emailOutboxTenants, projectPermissionTenants, notificationPreferenceTenants, refreshTokenTenants, oauthAccountTenants, and deletedRowTenants and either replace the short codes (e.g., TM, TMB, TP) with descriptive keys (e.g., Team, TeamMember, TeamPermission) or add a comment immediately above explaining each abbreviation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@apps/backend/src/app/api/latest/internal/external-db-sync/sequencer/route.ts`:
- Around line 142-158: The cascade UPDATE uses only tenancyId from teamTenants
and thus marks VerificationCode rows for all teams in those tenancies; change
the subquery filter to restrict to the specific teamIds in this batch by using
teamTenants' teamId (and if needed pair tenancyId+teamId) so only rows where
Team.teamId IN (the batch teamIds) and/or the pair (Tenancy.projectId,
Tenancy.branchId, Team.teamId) match are selected; update the
globalPrismaClient.$executeRaw block that builds the subquery (references:
teamTenants, globalPrismaClient.$executeRaw, "Team", "Tenancy",
"VerificationCode") to include the teamId filter from teamTenants.
In `@packages/stack-shared/src/config/db-sync-mappings.ts`:
- Around line 1823-1825: The deleted-branch mapping incorrectly casts
("DeletedRow"."data"->>'notificationCategoryId') to ::uuid for the
"notification_category_id" column; remove the ::uuid cast and return the raw
string (or cast to text/string) to match the ClickHouse schema and the
live-branch behavior—update the expression that produces
"notification_category_id" so it does not perform a UUID cast.
- Around line 1786-1799: The verifier config for notification_preferences
incorrectly uses SORT_KEYS: ["id"]; update the ClickHouse verifier
(clickhouse-sync-verifier.ts) so SORT_KEYS matches the table ORDER BY columns —
i.e., set SORT_KEYS to
["project_id","branch_id","user_id","notification_category_id"] (or the
equivalent constant used in that file) and remove the non-existent "id"
reference to align with the notification_preferences table schema.
- Around line 1993-2004: The SELECT for "user_id" can produce NULL or fail on
::uuid if "DeletedRow"."data" lacks projectUserId; update the "user_id"
expression in the SELECT to safely coalesce and cast, e.g. derive the value from
("DeletedRow"."data"->>'projectUserId') with a fallback to
("DeletedRow"."primaryKey"->>'projectUserId') and guard the cast with
NULLIF/CASE so an empty/missing string doesn't attempt ::uuid; adjust the
"user_id" field generation accordingly to ensure a non-null/valid UUID or a
controlled NULL per schema expectations.
- Around line 1859-1861: The Postgres fetch query in the internalDbFetchQuery
deleted branch incorrectly casts
("DeletedRow"."data"->>'notificationCategoryId') to ::uuid; update the mapping
that produces "notification_category_id" to cast it to ::text (or remove the
uuid cast) to match the Postgres schema; locate the deleted-branch SQL in the
internalDbFetchQuery mapping where
("DeletedRow"."data"->>'notificationCategoryId')::uuid AS
"notification_category_id" and change the cast to ::text so the returned column
type aligns with the schema.
---
Duplicate comments:
In `@packages/stack-shared/src/config/db-sync-mappings.ts`:
- Around line 1708-1711: The deleted-branch SELECT uses
("DeletedRow"."primaryKey"->>'permissionId') AS "id" which mismatches the live
branch that aliases this column as "permission_id"; update the deleted-branch
alias to "permission_id" so both branches use the same column name (adjust the
expression in the DeletedRow primaryKey extraction to alias permissionId as
"permission_id" instead of "id").
- Around line 931-935: The deleted-branch SELECT is aliasing
("DeletedRow"."primaryKey"->>'permissionId') as "id" which mismatches the live
branch's "permission_id" shape; update the deleted branch to alias permissionId
as "permission_id" (i.e., change the alias for
("DeletedRow"."primaryKey"->>'permissionId') to "permission_id") so the
DeletedRow output shape matches the live branch's output and mixed batches
process consistently.
- Around line 2178-2187: The tombstone SELECT reads nullable JSON fields into
non-nullable ClickHouse columns; update the expressions to provide safe
fallbacks using COALESCE so ClickHouse doesn't get NULLs: replace
("DeletedRow"."data"->>'projectUserId')::uuid with a COALESCE fallback to the
nil UUID literal, and wrap "DeletedRow"."data"->>'configOAuthProviderId' and
"DeletedRow"."data"->>'providerAccountId' with COALESCE(..., '') so "provider"
and "provider_account_id" emit empty strings instead of NULLs; adjust these
three expressions in the SELECT targeting the DeletedRow JSON fields.
---
Nitpick comments:
In
`@apps/backend/src/app/api/latest/internal/external-db-sync/sequencer/route.ts`:
- Around line 190-208: The rows_to_update CTE used in the query assigned to
teamPermissionTenants references "tenancyId" in ORDER BY but only selects "id",
causing an inconsistency; update the SQL (the query inside route.ts that builds
teamPermissionTenants) so the rows_to_update CTE also selects "tenancyId" (e.g.,
SELECT "id", "tenancyId") to match the ORDER BY and mirror the pattern used
elsewhere in this file.
- Around line 278-296: The CTE rows_to_update in the
globalPrismaClient.$queryRaw query for ProjectUserDirectPermission orders by
"tenancyId" but doesn't select it, which can be inconsistent; modify the
rows_to_update CTE used in the query (the SELECT inside rows_to_update) to
include "tenancyId" alongside "id" so ORDER BY "tenancyId" is valid and
consistent with the TeamMemberDirectPermission pattern, keeping the rest of the
UPDATE and SELECT DISTINCT "tenancyId" logic unchanged.
- Line 418: The console.log line in route.ts that prints backfilled counts uses
terse abbreviations; update the log to use clearer labels or add a one-line
legend comment above it so operators can map abbreviations to meanings. Locate
the console.log that references projectUserTenants, contactChannelTenants,
teamTenants, teamMemberTenants, teamPermissionTenants, teamInvitationTenants,
emailOutboxTenants, projectPermissionTenants, notificationPreferenceTenants,
refreshTokenTenants, oauthAccountTenants, and deletedRowTenants and either
replace the short codes (e.g., TM, TMB, TP) with descriptive keys (e.g., Team,
TeamMember, TeamPermission) or add a comment immediately above explaining each
abbreviation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 34b4d6da-ca8f-4c87-919b-193ea3c57028
📒 Files selected for processing (5)
- apps/backend/src/app/api/latest/internal/external-db-sync/sequencer/route.ts
- apps/backend/src/app/api/latest/internal/external-db-sync/status/route.ts
- apps/backend/src/app/api/latest/oauth-providers/crud.tsx
- apps/backend/src/lib/external-db-sync.ts
- packages/stack-shared/src/config/db-sync-mappings.ts
✅ Files skipped from review due to trivial changes (2)
- apps/backend/src/app/api/latest/internal/external-db-sync/status/route.ts
- apps/backend/src/lib/external-db-sync.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- apps/backend/src/app/api/latest/oauth-providers/crud.tsx
Summary by CodeRabbit
New Features
Improvements