
fix analytics queries #1141

Merged

BilalG1 merged 15 commits into dev from fix-analytics-queries on Jan 29, 2026

Conversation

Collaborator

@BilalG1 BilalG1 commented Jan 28, 2026

Summary by CodeRabbit

  • Improvements

    • Added automatic retry with stricter result validation and clearer error handling for query profiling to improve reliability.
  • Chores

    • CI workflows updated to use larger runner instances (upgraded runner size across relevant jobs).



vercel Bot commented Jan 28, 2026

The latest updates on your projects.

Project | Deployment | Review | Updated (UTC)
stack-backend | Ready | Preview, Comment | Jan 29, 2026 5:51pm
stack-dashboard | Ready | Preview, Comment | Jan 29, 2026 5:51pm
stack-demo | Ready | Preview, Comment | Jan 29, 2026 5:51pm
stack-docs | Ready | Preview, Comment | Jan 29, 2026 5:51pm

Contributor

coderabbitai Bot commented Jan 28, 2026

📝 Walkthrough

Add a retry wrapper around ClickHouse profiling queries to enforce exactly one result with backoff retries; update several GitHub Actions workflows to use ubicloud-standard-16 runners (runner image bump only).

Changes

  • ClickHouse Query Resilience (apps/backend/src/lib/clickhouse.tsx): Extracted the query into an internal queryProfile function and added a retry loop with delays of [75, 150, 300, 600, 1200] ms. The loop enforces a single-row result (returning the row), throws on multiple rows, and throws if there are zero results after all retries. Explicit result typing and JSON extraction were added.

  • CI Workflows, runner bump (.github/workflows/restart-dev-and-test.yaml, .github/workflows/restart-dev-and-test-with-custom-base-port.yaml, .github/workflows/setup-tests.yaml, .github/workflows/setup-tests-with-custom-base-port.yaml): Changed the job runner from ubicloud-standard-8 to ubicloud-standard-16 in each workflow; no other workflow logic or steps were modified.
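The retry behavior summarized above can be sketched generically. This is a hypothetical helper, not the PR's actual code: the function name and the stand-in runQuery callback are invented for illustration, while the delay sequence and the exactly-one-row rule come from the walkthrough.

```typescript
// Sketch of the retry-with-backoff pattern: re-run a query until it yields
// exactly one row, waiting 75/150/300/600/1200 ms between attempts.
// `runQuery` is a stand-in for the real ClickHouse profiling call.
const RETRY_DELAYS_MS = [75, 150, 300, 600, 1200];

async function queryExactlyOne<T>(runQuery: () => Promise<T[]>): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const rows = await runQuery();
    if (rows.length > 1) {
      // More than one row indicates a logic error, so fail immediately.
      throw new Error(`expected exactly one row, got ${rows.length}`);
    }
    if (rows.length === 1) return rows[0];
    // Zero rows: back off and retry, or give up once delays are exhausted.
    if (attempt >= RETRY_DELAYS_MS.length) {
      throw new Error("no result after all retries");
    }
    await new Promise((resolve) => setTimeout(resolve, RETRY_DELAYS_MS[attempt]));
  }
}
```

The real queryProfile additionally types the result rows and extracts them from the ClickHouse JSON response; this sketch only shows the control flow.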

Sequence Diagram(s)

(Skipped)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I nibbled code beneath the moon,
Tossed retries in a careful tune,
One row to fetch, no noisy throng,
Backoffs tapped a patient song,
CI woke up on a newer rune.

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Description check (⚠️ Warning): The pull request description is empty beyond the contribution template comment and explains neither the analytics query changes, the retry logic, nor the rationale for the infrastructure updates. Resolution: add a description covering the analytics query fix (retry mechanism, error handling), the reason for upgrading runner images, and the impact of these changes.

  • Title check (❓ Inconclusive): The title "fix analytics queries" only partially matches the changeset: the backend analytics query logic gained a retry mechanism, but most of the changes update GitHub Actions runner images, which the title does not reflect. Resolution: broaden the title to cover both the analytics query fix and the infrastructure changes, or confirm that the runner upgrades are secondary to the analytics fix.

✅ Passed checks (1 passed)

  • Docstring Coverage (✅ Passed): No functions were found in the changed files, so the docstring coverage check was skipped.


Contributor

greptile-apps Bot commented Jan 28, 2026

Greptile Summary

Modified the getQueryTimingStats function to query across all ClickHouse cluster replicas using clusterAllReplicas('default', system.query_log) instead of directly querying system.query_log.

Key Changes:

  • Query now uses clusterAllReplicas('default', system.query_log) to fetch query timing statistics across all replicas in a clustered ClickHouse setup

Concerns:

  • The change assumes a cluster named 'default' exists, which may not be true in all environments
  • The dev docker-compose configuration runs ClickHouse as a single instance without clustering
  • No tests were added to verify this works in both clustered and non-clustered environments
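The first concern could be probed with a cheap pre-flight query against ClickHouse's built-in system.clusters table before relying on clusterAllReplicas. This is an illustrative sketch, not part of the PR; the helper name is invented.

```typescript
// Illustrative guard for the "cluster named 'default' may not exist" concern:
// build a query that counts matching rows in ClickHouse's system.clusters
// table. A result of 0 means the cluster is not configured, so the caller
// should fall back to querying system.query_log directly on the single node.
function clusterExistsQuery(clusterName: string): string {
  return `SELECT count() AS n FROM system.clusters WHERE cluster = '${clusterName}'`;
}
```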

Confidence Score: 2/5

  • This PR may break analytics functionality in non-clustered ClickHouse environments
  • The change assumes ClickHouse is running in cluster mode with a cluster named 'default', but the dev environment and potentially other deployments run ClickHouse as a single instance. This could cause query failures. No tests verify compatibility across different deployment configurations.
  • apps/backend/src/lib/clickhouse.tsx requires verification that the cluster query works in all environments

Important Files Changed

  • apps/backend/src/lib/clickhouse.tsx: Changed the query from system.query_log to clusterAllReplicas('default', system.query_log) for distributed querying; this may break in non-clustered environments.

Sequence Diagram

sequenceDiagram
    participant Client
    participant getQueryTimingStats
    participant ClickHouse as ClickHouse Cluster
    participant QueryLog as system.query_log

    Client->>getQueryTimingStats: Request timing stats for queryId
    getQueryTimingStats->>ClickHouse: SYSTEM FLUSH LOGS (admin auth)
    ClickHouse-->>getQueryTimingStats: Logs flushed
    getQueryTimingStats->>ClickHouse: Query clusterAllReplicas('default', system.query_log)
    ClickHouse->>QueryLog: Fetch from all replicas
    QueryLog-->>ClickHouse: ProfileEvents data
    ClickHouse-->>getQueryTimingStats: cpu_time_ms, wall_clock_time_ms
    alt Query log entry found
        getQueryTimingStats-->>Client: Return timing stats
    else No entry found
        getQueryTimingStats-->>Client: Throw StackAssertionError
    end

Contributor

@greptile-apps greptile-apps Bot left a comment


1 file reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Comment thread apps/backend/src/lib/clickhouse.tsx Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/backend/src/lib/clickhouse.tsx (1)

52-58: Make cluster name configurable and align log flush with clustered query.

Hardcoding clusterAllReplicas('default', system.query_log) will fail on deployments with non-default cluster names, and the local SYSTEM FLUSH LOGS doesn't flush logs from other replicas, potentially missing query statistics. Configure the cluster name via environment variable and use SYSTEM FLUSH LOGS ON CLUSTER when clustering is enabled.

🔧 Proposed fix (configurable cluster + cluster-aware flush)
 const clickhouseExternalPassword = getEnvVariable("STACK_CLICKHOUSE_EXTERNAL_PASSWORD", "");
 const clickhouseDefaultDatabase = getEnvVariable("STACK_CLICKHOUSE_DATABASE", "default");
+const clickhouseCluster = getEnvVariable("STACK_CLICKHOUSE_CLUSTER", "");
 const HAS_CLICKHOUSE = !!clickhouseUrl && !!clickhouseAdminPassword && !!clickhouseExternalPassword;
   await client.exec({
-    query: "SYSTEM FLUSH LOGS",
+    query: clickhouseCluster
+      ? `SYSTEM FLUSH LOGS ON CLUSTER '${clickhouseCluster}'`
+      : "SYSTEM FLUSH LOGS",
     auth: {
       username: clickhouseAdminUser,
       password: clickhouseAdminPassword,
     },
   });
+  const queryLogSource = clickhouseCluster
+    ? `clusterAllReplicas('${clickhouseCluster}', system.query_log)`
+    : "system.query_log";
   const profile = await client.query({
     query: `
     SELECT
       ProfileEvents['CPUTimeMicroseconds'] / 1000 AS cpu_time_ms,
       ProfileEvents['RealTimeMicroseconds'] / 1000 AS wall_clock_time_ms
-    FROM clusterAllReplicas('default', system.query_log)
+    FROM ${queryLogSource}
     WHERE query_id = {query_id:String} AND type = 'QueryFinish'
     ORDER BY event_time DESC
     LIMIT 1
   `,

Comment thread apps/backend/src/lib/clickhouse.tsx
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@apps/backend/src/lib/clickhouse.tsx`:
- Around lines 51-67: The queryProfile function's ClickHouse query hard-codes clusterAllReplicas('default', ...) and risks per-replica LIMIT pushdown. Update the client.query call to use the configured cluster name (the existing cluster variable, or an env var such as CLICKHOUSE_CLUSTER) instead of 'default', and append "SETTINGS distributed_push_down_limit = 0" to the SQL so the LIMIT is applied on the initiator rather than on each replica.
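A minimal sketch of what that suggestion could look like as a query builder. The function name and the undefined-means-single-node convention are assumptions for illustration; the SQL itself follows the PR's query and the bot's suggested SETTINGS clause.

```typescript
// Build the profiling query against a configurable cluster. When a cluster
// name is provided, read system.query_log from all replicas and keep the
// LIMIT on the initiator via distributed_push_down_limit = 0; otherwise
// query the local table directly (single-node deployments).
function buildProfileQuery(clusterName: string | undefined): string {
  const source = clusterName
    ? `clusterAllReplicas('${clusterName}', system.query_log)`
    : "system.query_log";
  return `
    SELECT
      ProfileEvents['CPUTimeMicroseconds'] / 1000 AS cpu_time_ms,
      ProfileEvents['RealTimeMicroseconds'] / 1000 AS wall_clock_time_ms
    FROM ${source}
    WHERE query_id = {query_id:String} AND type = 'QueryFinish'
    ORDER BY event_time DESC
    LIMIT 1
    ${clusterName ? "SETTINGS distributed_push_down_limit = 0" : ""}
  `;
}
```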

Comment thread apps/backend/src/lib/clickhouse.tsx
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
apps/e2e/tests/backend/endpoints/api/v1/analytics-events.test.ts (1)

64-70: Fix inconsistent team_id expectation in the second test.

The backend explicitly sets team_id: null for token refresh events (see apps/backend/src/lib/events.tsx:255), yet the "cannot read events from other projects" test expects team_id: "" (line 55). The first test correctly expects null. Update the snapshot in the second test to match the backend behavior:

Snapshot change required
{
  "branch_id": "main",
  "event_type": "$token-refresh",
  "project_id": "<stripped UUID>",
- "team_id": "",
+ "team_id": null,
  "user_id": "<stripped UUID>",
},
🧹 Nitpick comments (1)
apps/e2e/tests/backend/endpoints/api/v1/analytics-events.test.ts (1)

141-152: Consider typing the row parameter instead of using any.

Lines 143 and 152 use (row: any) without explanation. Per coding guidelines, avoid any or leave a comment explaining why the type system fails. A simple inline type would improve type safety:

type AnalyticsRow = { user_id: string; event_type: string; project_id: string; branch_id: string; team_id: string | null };
♻️ Suggested improvement
+type AnalyticsRow = { user_id: string; event_type: string; project_id: string; branch_id: string; team_id: string | null };
+
 // Then use:
-  expect(userAResults.every((row: any) => row.user_id === userA)).toBe(true);
+  expect(userAResults.every((row: AnalyticsRow) => row.user_id === userA)).toBe(true);

As per coding guidelines: "Avoid the any type. When necessary, leave a comment explaining why it's being used."

@BilalG1 BilalG1 requested a review from N2D4 January 29, 2026 00:11
@BilalG1 BilalG1 assigned N2D4 and unassigned BilalG1 Jan 29, 2026
Comment thread apps/e2e/tests/backend/endpoints/api/v1/analytics-events.test.ts Outdated
Comment thread apps/backend/src/lib/clickhouse.tsx Outdated
Comment thread apps/e2e/tests/backend/endpoints/api/v1/analytics-events.test.ts Outdated
@github-actions github-actions Bot assigned BilalG1 and unassigned N2D4 Jan 29, 2026
Comment thread .github/workflows/setup-tests.yaml Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @.github/workflows/setup-tests.yaml:
- Line 46: The workflow step runs "pnpm test run --reporter=verbose", which passes an extra "run" token to the test script. Change it to "pnpm test --reporter=verbose" (or "pnpm run test --reporter=verbose") so the package script is invoked correctly.

Comment thread .github/workflows/setup-tests.yaml Outdated
@BilalG1 BilalG1 merged commit 7b5cf4f into dev Jan 29, 2026
27 of 29 checks passed
@BilalG1 BilalG1 deleted the fix-analytics-queries branch January 29, 2026 18:22
@coderabbitai coderabbitai Bot mentioned this pull request Feb 2, 2026