Skip to content

Fix cron jobs using dev env instead of test env in CI workflows#1319

Merged
BilalG1 merged 2 commits into
devfrom
fix/cron-jobs-test-env
Apr 10, 2026
Merged

Fix cron jobs using dev env instead of test env in CI workflows#1319
BilalG1 merged 2 commits into
devfrom
fix/cron-jobs-test-env

Conversation

@BilalG1
Copy link
Copy Markdown
Collaborator

@BilalG1 BilalG1 commented Apr 10, 2026

The custom-base-port and db-migration-backwards-compatibility workflows were running cron jobs with with-env:dev instead of with-env:test, causing ClickHouse sync mismatches in verify-data-integrity.

Summary by CodeRabbit

  • Chores
    • Streamlined CI test workflows to standardize background cron job startup for more consistent test runs.
  • Tests
    • Improved end-to-end test reliability by aligning background process behavior across suites.
  • Bug Fixes
    • Enhanced data verification reliability by ensuring external database sync before integrity checks and tightening comparison ordering for certain records, reducing false mismatch detections.

The custom-base-port and db-migration-backwards-compatibility workflows
were running cron jobs with `with-env:dev` instead of `with-env:test`,
causing ClickHouse sync mismatches in verify-data-integrity.
@vercel
Copy link
Copy Markdown

vercel Bot commented Apr 10, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
stack-auth-hosted-components Ready Ready Preview, Comment Apr 10, 2026 3:01am
stack-backend Ready Ready Preview, Comment Apr 10, 2026 3:01am
stack-dashboard Ready Ready Preview, Comment Apr 10, 2026 3:01am
stack-demo Ready Ready Preview, Comment Apr 10, 2026 3:01am
stack-docs Ready Ready Preview, Comment Apr 10, 2026 3:01am
stack-preview-backend Ready Ready Preview, Comment Apr 10, 2026 3:01am
stack-preview-dashboard Ready Ready Preview, Comment Apr 10, 2026 3:01am

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 10, 2026

📝 Walkthrough

Walkthrough

Added a pre-verification sync call and adjusted CI cron job startup commands: workflows now run run-cron-jobs:test in background; verify script now calls syncExternalDatabases(tenancy) before running ClickHouse sync verification.

Changes

Cohort / File(s) Summary
CI Workflow Cron Job Command Updates
​.github/workflows/db-migration-backwards-compatibility.yaml, ​.github/workflows/e2e-custom-base-port-api-tests.yaml
Replaced background invocation of the dev script (with-env:dev tsx scripts/run-cron-jobs.ts) with the test-targeted command run-cron-jobs:test for starting cron jobs in the background.
ClickHouse verifier key change
apps/backend/scripts/verify-data-integrity/clickhouse-sync-verifier.ts
Adjusted SORT_KEYS for notification_preferences from ["id"] to ["user_id","notification_category_id"], changing row ordering used for deep comparison between Postgres and ClickHouse.
Pre-verification external sync
apps/backend/scripts/verify-data-integrity/index.ts
Added import { syncExternalDatabases } and an await syncExternalDatabases(tenancy) call before verifyClickhouseSync(...) to flush external DB sync work prior to ClickHouse verification.

Sequence Diagram(s)

sequenceDiagram
  participant Verifier as VerifyScript
  participant Syncer as ExternalDBSync
  participant ClickHouse as ClickHouseVerifier
  participant Postgres as PostgresDB

  Verifier->>Syncer: syncExternalDatabases(tenancy)
  Syncer-->>Verifier: sync complete
  Verifier->>ClickHouse: verifyClickhouseSync(tenancy)
  ClickHouse->>Postgres: query expected rows
  ClickHouse->>ClickHouse: compare ordered rows (SORT_KEYS)
  ClickHouse-->>Verifier: verification result
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Poem

🐰
I hopped in code to nudge the stream,
A sync before the ClickHouse dream,
Crons now start with test-tuned art,
Rows align and tests take part,
I twitch my nose — CI passes, beam! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: fixing cron jobs to use test env instead of dev env in CI workflows, which aligns with the primary changes in the two workflow files.
Description check ✅ Passed The description clearly explains the issue (cron jobs using dev env instead of test env) and its consequence (ClickHouse sync mismatches), adequately documenting the purpose of the changes.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/cron-jobs-test-env

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 10, 2026

Greptile Summary

This PR fixes three CI workflow steps that were running cron jobs with the dev environment instead of the test environment, causing ClickHouse sync mismatches when verify-data-integrity ran afterward. The fix aligns db-migration-backwards-compatibility.yaml (both the backwards-compatibility and forward-compatibility jobs) and e2e-custom-base-port-api-tests.yaml with the pattern already used in e2e-api-tests.yaml and e2e-api-tests-local-emulator.yaml, which correctly call run-cron-jobs:test (backed by dotenv -c test).

Confidence Score: 5/5

Safe to merge — targeted, correct fix with no logic issues.

All three changed lines are straightforward environment-name corrections that align the affected workflows with the already-working e2e-api-tests.yaml pattern. The run-cron-jobs:test npm script is defined in package.json and correctly invokes dotenv -c test. No new logic is introduced and no other concerns were found.

No files require special attention.

Important Files Changed

Filename Overview
.github/workflows/db-migration-backwards-compatibility.yaml Corrects two cron-job launch steps (backwards-compatibility and forward-compatibility jobs) to use run-cron-jobs:test instead of the old with-env:dev tsx scripts/run-cron-jobs.ts invocation.
.github/workflows/e2e-custom-base-port-api-tests.yaml Corrects the cron-job launch step to use run-cron-jobs:test instead of the bare run-cron-jobs script (which hard-codes the dev env), making it consistent with the other e2e workflows.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[CI Workflow Triggered] --> B[Build & Install Dependencies]
    B --> C[Copy .env.development → .env.test.local]
    C --> D[Start Backend / Dashboard / OAuth Server]
    D --> E[Start run-email-queue\npnpm run run-email-queue\n uses with-env:dev]
    D --> F["Start run-cron-jobs\n(BEFORE: with-env:dev)\n(AFTER: run-cron-jobs:test → with-env:test)"]
    E --> G[Run Tests]
    F --> G
    G --> H[verify-data-integrity]
    H --> I{ClickHouse in sync?}
    I -- "BEFORE fix: NO\n(dev env ≠ test env)" --> J[❌ Failure]
    I -- "AFTER fix: YES\n(both use test env)" --> K[✅ Pass]
Loading

Reviews (1): Last reviewed commit: "Fix cron jobs using dev env instead of t..." | Re-trigger Greptile

Two issues:
1. verify-data-integrity runs before all QStash sync callbacks have
   been delivered, causing stale data in ClickHouse. Fix by calling
   syncExternalDatabases() directly before verifying each tenancy.
2. notification_preferences sort keys used ["id"] but the ClickHouse
   table has no id column — it uses (user_id, notification_category_id)
   as the composite key. Fix sort keys to match.
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/backend/scripts/verify-data-integrity/index.ts`:
- Around line 232-235: syncExternalDatabases(tenancy) currently has its return
value ignored which lets verify run despite pending/throttled sync work; update
the call site to inspect the returned flag (e.g., needsResync) and implement a
bounded retry loop (with a small backoff and max attempts) that re-invokes
syncExternalDatabases and only proceeds to ClickHouse verification when it
returns false, otherwise fail fast by throwing an error after exhausting
retries; reference the syncExternalDatabases return handling and the subsequent
ClickHouse verification call so the change affects the awaiting logic before
verification.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: d5a8d561-d394-493a-9b98-00db7cd59589

📥 Commits

Reviewing files that changed from the base of the PR and between 4e402e2 and 1d1dd0f.

📒 Files selected for processing (2)
  • apps/backend/scripts/verify-data-integrity/clickhouse-sync-verifier.ts
  • apps/backend/scripts/verify-data-integrity/index.ts

Comment on lines +232 to +235
// Flush any pending ClickHouse syncs by running a direct sync before verifying.
// This avoids race conditions where QStash hasn't delivered all sync callbacks yet.
await syncExternalDatabases(tenancy);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Handle syncExternalDatabases() return value before verifying ClickHouse.

Line 234 currently ignores the needsResync signal, so verification can still run with pending/throttled sync work.

🔧 Proposed fix (bounded retry + fail fast)
         await recurse("[clickhouse sync]", async (recurse) => {
           // Flush any pending ClickHouse syncs by running a direct sync before verifying.
           // This avoids race conditions where QStash hasn't delivered all sync callbacks yet.
-          await syncExternalDatabases(tenancy);
+          const MAX_SYNC_ATTEMPTS = 5;
+          for (let attempt = 1; attempt <= MAX_SYNC_ATTEMPTS; attempt++) {
+            const needsResync = await syncExternalDatabases(tenancy);
+            if (!needsResync) break;
+            if (attempt === MAX_SYNC_ATTEMPTS) {
+              throw new StackAssertionError(
+                `ClickHouse sync still pending after ${MAX_SYNC_ATTEMPTS} attempts for project ${projectId}.`
+              );
+            }
+            await wait(250);
+          }

           await verifyClickhouseSync({
             tenancy,
             projectId,
             branchId: DEFAULT_BRANCH_ID,

As per coding guidelines: "Fail early, fail loud; fail fast with an error instead of silently continuing" and "Validate all assumptions through type system (preferred), assertions, or tests; ideally two out of three".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/backend/scripts/verify-data-integrity/index.ts` around lines 232 - 235,
syncExternalDatabases(tenancy) currently has its return value ignored which lets
verify run despite pending/throttled sync work; update the call site to inspect
the returned flag (e.g., needsResync) and implement a bounded retry loop (with a
small backoff and max attempts) that re-invokes syncExternalDatabases and only
proceeds to ClickHouse verification when it returns false, otherwise fail fast
by throwing an error after exhausting retries; reference the
syncExternalDatabases return handling and the subsequent ClickHouse verification
call so the change affects the awaiting logic before verification.

@BilalG1 BilalG1 merged commit 9e342da into dev Apr 10, 2026
34 of 36 checks passed
@BilalG1 BilalG1 deleted the fix/cron-jobs-test-env branch April 10, 2026 04:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant