Skip to content

fix: wire generated_by_model into observation write path#1602

Closed
alessandropcostabr wants to merge 3 commits intothedotmack:mainfrom
alessandropcostabr:fix/wire-generated-by-model
Closed

fix: wire generated_by_model into observation write path#1602
alessandropcostabr wants to merge 3 commits intothedotmack:mainfrom
alessandropcostabr:fix/wire-generated-by-model

Conversation

@alessandropcostabr
Copy link
Copy Markdown
Contributor

Summary

  • The generated_by_model column was added to the observations table in the Phase 0 governance schema migration but never wired into any INSERT statement
  • All 3,878+ observations in production have this field as NULL
  • This fix threads the model ID from each agent (SDKAgent, GeminiAgent, OpenRouterAgent) through processAgentResponse() into all three store methods

Motivation

This unblocks the Thompson Sampling RFC (#1571) which needs {observation_type}:{model} as the bandit arm key. Without knowing which model generated each observation, the bandit cannot learn optimal model routing.

Changes

File Change
SessionStore.ts Added generated_by_model to INSERT in storeObservation(), storeObservations(), and storeObservationsAndMarkComplete()
ResponseProcessor.ts Added optional modelId parameter, passed through to storeObservations()
SDKAgent.ts Passes modelId (from session.modelOverride || this.getModelId())
GeminiAgent.ts Passes model (from this.getGeminiConfig()) to all 3 call sites
OpenRouterAgent.ts Passes model (from this.getOpenRouterConfig()) to all 3 call sites

Test plan

  • Verify new observations get generated_by_model populated in SQLite
  • Verify existing observations (NULL) are unaffected
  • Verify MemoryRoutes.ts manual observations still work (no model param = NULL, expected)
  • Query SELECT generated_by_model, COUNT(*) FROM observations GROUP BY generated_by_model after a few sessions

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 (1M context) noreply@anthropic.com

The generated_by_model column was added to the observations table in the
Phase 0 governance schema migration but never wired into the INSERT
statements. All 3,878+ observations in production have this field NULL.

This fix threads the model ID from each agent (SDKAgent, GeminiAgent,
OpenRouterAgent) through processAgentResponse() into storeObservation(),
storeObservations(), and storeObservationsAndMarkComplete().

Unblocks Thompson Sampling RFC (thedotmack#1571) which needs {obs_type}:{model}
as the bandit arm key.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@chatgpt-codex-connector
Copy link
Copy Markdown

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 5, 2026

Summary by CodeRabbit

  • New Features
    • Observations and summaries now record which AI model generated them across all providers.
  • Chores
    • Database schema updated via a migration to add model attribution and a relevance counter.
  • Bug Fixes
    • Agent processing now consistently passes model identifiers so generation attribution is reliable.

Walkthrough

Agents now forward a model identifier through the response processing pipeline into the session store; the session store records that identifier in a new generated_by_model column on observations. A migration and migration runner changes add the column (and relevance_count) to the DB when missing.

Changes

Cohort / File(s) Summary
Agent Response Handlers
src/services/worker/GeminiAgent.ts, src/services/worker/OpenRouterAgent.ts, src/services/worker/SDKAgent.ts
Calls to processAgentResponse updated to include the selected model identifier (model / modelId) in init, observation, and summary handling.
Response Processing
src/services/worker/agents/ResponseProcessor.ts
processAgentResponse signature extended with optional modelId?: string; propagated into session store calls.
Session Store
src/services/sqlite/SessionStore.ts
storeObservation, storeObservations, storeObservationsAndMarkComplete signatures accept generatedByModel?: string; INSERT statements updated to include generated_by_model and pass `generatedByModel
Migrations
src/services/sqlite/migrations.ts, src/services/sqlite/migrations/runner.ts
Added migration009 (version 26) and migration runner logic ensureGeneratedByModelColumn() to conditionally add generated_by_model TEXT and relevance_count INTEGER DEFAULT 0 to observations if absent; migration appended to exports and invoked during runner flow.

Sequence Diagram

sequenceDiagram
    participant Agent as Agent (Gemini/OpenRouter/SDK)
    participant Processor as ResponseProcessor
    participant Store as SessionStore
    participant DB as SQLite DB

    Agent->>Processor: processAgentResponse(..., modelId)
    Note over Processor: receives model identifier
    Processor->>Store: storeObservations(..., generatedByModel)
    Note over Store: prepares INSERT including generated_by_model
    Store->>DB: INSERT INTO observations (..., generated_by_model ...)
    DB-->>Store: OK
    Store-->>Processor: observation IDs
    Processor-->>Agent: complete
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I hopped through code and schema rows,

I tagged each thought where provenance goes,
From agent voice to SQLite dream,
I left a trail — a tiny seam,
A rabbit’s hop keeps sources close.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately and concisely describes the main change: wiring the generated_by_model field into observation write paths.
Description check ✅ Passed The description comprehensively explains the problem, motivation, changes, and test plan—all directly related to the changeset.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot]
coderabbitai Bot previously approved these changes Apr 5, 2026
The compiled binary (v10.6.3) creates these columns at runtime via
MigrationRunner, but no corresponding migration exists in the TypeScript
source. Anyone building from source gets observations without these
columns, breaking the feedback pipeline and model tracking.

This migration conditionally adds both columns using PRAGMA table_info
checks, making it safe for databases that already have them.

Refs: thedotmack#1626

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@alessandropcostabr
Copy link
Copy Markdown
Contributor Author

Added migration009 for generated_by_model and relevance_count columns on the observations table.

Why: The compiled binary (v10.6.3) creates these columns via MigrationRunner at runtime, but no migration exists in the TypeScript source. Building from source produces a database without these columns, breaking model tracking and the feedback pipeline.

What: Conditionally adds both columns using PRAGMA table_info checks — safe for databases that already have them (no-op).

This is a companion fix to #1626 (schema mismatch between source and binary).

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
src/services/sqlite/migrations.ts (3)

514-515: Orphaned comment from previous edit.

The comment "All migrations in order" at lines 514–515 appears orphaned—it's followed by migration008's definition rather than the migrations array. The correct comment exists at lines 575–577. Consider removing this duplicate.

🧹 Suggested cleanup
 };
 
-
-/**
- * All migrations in order
- */
 /**
  * Migration 008: Observation feedback table for tracking observation usage
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/sqlite/migrations.ts` around lines 514 - 515, Remove the
orphaned duplicate comment "All migrations in order" that appears immediately
before the definition of migration008; keep the existing correct comment that
precedes the migrations array (the one at lines around the migrations
declaration) and ensure only that comment remains to document the migrations
array and not the standalone migration008 declaration.

559-561: Consider using proper typing instead of any.

The codebase appears to have a TableColumnInfo type (seen in context snippet from SessionStore.ts line 278). Using explicit types would improve type safety and maintainability.

♻️ Suggested improvement
-    const columns = db.prepare('PRAGMA table_info(observations)').all() as any[];
-    const hasGeneratedByModel = columns.some((c: any) => c.name === 'generated_by_model');
-    const hasRelevanceCount = columns.some((c: any) => c.name === 'relevance_count');
+    const columns = db.prepare('PRAGMA table_info(observations)').all() as { name: string }[];
+    const hasGeneratedByModel = columns.some((c) => c.name === 'generated_by_model');
+    const hasRelevanceCount = columns.some((c) => c.name === 'relevance_count');

Or import/define a shared TableColumnInfo interface if one exists in the codebase.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/sqlite/migrations.ts` around lines 559 - 561, The code uses any
for PRAGMA results: replace the any[] cast on columns with a concrete type
(e.g., TableColumnInfo[]) by importing or defining the shared TableColumnInfo
interface and use it in the expression where you call db.prepare('PRAGMA
table_info(observations)').all() so columns is typed, and then update the
callbacks for hasGeneratedByModel and hasRelevanceCount to use that type (e.g.,
(c: TableColumnInfo) => c.name === 'generated_by_model'), ensuring you import
TableColumnInfo from the same module used elsewhere (or add a local interface if
none exists).

556-573: Add success logging for consistency with other migrations.

All other migrations (001–008) include console.log statements indicating success. Adding similar logging here would maintain consistency and aid debugging.

♻️ Suggested improvement
     if (!hasRelevanceCount) {
       db.run('ALTER TABLE observations ADD COLUMN relevance_count INTEGER DEFAULT 0');
     }
+
+    console.log('✅ Added generated_by_model and relevance_count columns to observations table');
   },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/sqlite/migrations.ts` around lines 556 - 573, Add a success log
to migration009's up function so it mirrors other migrations: after adding the
columns (i.e., after the db.run calls in migration009.up and after any
conditional branches that may or may not run), call console.log with a clear
message like "migration009 applied: observations columns updated" to indicate
success; ensure the message runs regardless of whether columns were already
present or newly added so the migration always emits a consistent success log.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/services/sqlite/migrations.ts`:
- Around line 544-573: migration009 (version 26) that adds generated_by_model
and relevance_count is defined but not executed by the migration runner, causing
INSERTs from SessionStore (SessionStore methods that write generated_by_model)
to fail; fix by adding a method on the migration runner (e.g.,
addObservationOptionalColumns) which checks PRAGMA table_info(observations) for
generated_by_model and relevance_count, runs ALTER TABLE to add them if missing,
and inserts version 26 into schema_versions, and then call this new method from
runAllMigrations() immediately after createObservationFeedbackTable() so
migration009's changes are applied during normal migration runs.

---

Nitpick comments:
In `@src/services/sqlite/migrations.ts`:
- Around line 514-515: Remove the orphaned duplicate comment "All migrations in
order" that appears immediately before the definition of migration008; keep the
existing correct comment that precedes the migrations array (the one at lines
around the migrations declaration) and ensure only that comment remains to
document the migrations array and not the standalone migration008 declaration.
- Around line 559-561: The code uses any for PRAGMA results: replace the any[]
cast on columns with a concrete type (e.g., TableColumnInfo[]) by importing or
defining the shared TableColumnInfo interface and use it in the expression where
you call db.prepare('PRAGMA table_info(observations)').all() so columns is
typed, and then update the callbacks for hasGeneratedByModel and
hasRelevanceCount to use that type (e.g., (c: TableColumnInfo) => c.name ===
'generated_by_model'), ensuring you import TableColumnInfo from the same module
used elsewhere (or add a local interface if none exists).
- Around line 556-573: Add a success log to migration009's up function so it
mirrors other migrations: after adding the columns (i.e., after the db.run calls
in migration009.up and after any conditional branches that may or may not run),
call console.log with a clear message like "migration009 applied: observations
columns updated" to indicate success; ensure the message runs regardless of
whether columns were already present or newly added so the migration always
emits a consistent success log.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 71404c81-a126-4b37-8695-2ead688015a2

📥 Commits

Reviewing files that changed from the base of the PR and between c3e5f3a and 5e69688.

📒 Files selected for processing (1)
  • src/services/sqlite/migrations.ts

Comment thread src/services/sqlite/migrations.ts
@alessandropcostabr
Copy link
Copy Markdown
Contributor Author

Self-review: Fix needed — migration009 not registered in MigrationRunner

The PR works locally (18 observations tagged with generated_by_model — 17 claude-sonnet-4-5, 1 claude-opus-4-6). However, as CodeRabbit correctly identified:

Bug: migration009 is defined in migrations.ts but not added to the MigrationRunner in runner.ts. On fresh installs, the generated_by_model and related columns won't be created, causing INSERT failures.

Fix: Add migration009() to the migrations array in runner.ts.

Interconnection note: This PR is a prerequisite for RFC #1571 (Thompson Sampling) — the bandit needs generated_by_model populated to form {type}:{model} arms. Also partially addresses #1626 (schema mismatch between source and compiled binary).

Will push the fix shortly.

The migration was defined in migrations.ts but never called by the runner.
On fresh installs, the generated_by_model and relevance_count columns
would not be created, causing INSERT failures.

Addresses CodeRabbit review feedback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@alessandropcostabr
Copy link
Copy Markdown
Contributor Author

Fix pushed: migration009 is now registered in MigrationRunner.runAllMigrations().

Fresh installs will correctly create generated_by_model and relevance_count columns via ensureGeneratedByModelColumn() (version 26, conditional — safe for existing DBs).

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/services/sqlite/migrations/runner.ts (1)

902-919: Implementation is correct and idempotent; minor consistency nits below.

The migration logic is sound—early return on recorded version, column existence checks via PRAGMA, and INSERT OR IGNORE for the version record.

Two optional consistency tweaks to align with other methods in this file:

  1. Line 906: Use this.db.query() instead of this.db.prepare() for the PRAGMA (matches ensureWorkerPortColumn, ensurePromptTrackingColumns, etc.).
  2. Line 918: Log per-column additions inside the if blocks rather than unconditionally at the end (matches ensureWorkerPortColumn at line 135).
♻️ Suggested consistency improvements
   private ensureGeneratedByModelColumn(): void {
     const applied = this.db.query('SELECT 1 FROM schema_versions WHERE version = 26').get();
     if (applied) return;

-    const columns = this.db.prepare('PRAGMA table_info(observations)').all() as TableColumnInfo[];
+    const columns = this.db.query('PRAGMA table_info(observations)').all() as TableColumnInfo[];
     const hasGeneratedByModel = columns.some((c) => c.name === 'generated_by_model');
     const hasRelevanceCount = columns.some((c) => c.name === 'relevance_count');

     if (!hasGeneratedByModel) {
       this.db.run('ALTER TABLE observations ADD COLUMN generated_by_model TEXT');
+      logger.debug('DB', 'Added generated_by_model column to observations table');
     }
     if (!hasRelevanceCount) {
       this.db.run('ALTER TABLE observations ADD COLUMN relevance_count INTEGER DEFAULT 0');
+      logger.debug('DB', 'Added relevance_count column to observations table');
     }

     this.db.prepare('INSERT OR IGNORE INTO schema_versions (version, applied_at) VALUES (?, ?)').run(26, new Date().toISOString());
-    logger.debug('DB', 'Ensured generated_by_model and relevance_count columns exist');
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/services/sqlite/migrations/runner.ts` around lines 902 - 919, The
ensureGeneratedByModelColumn method should use this.db.query(...) for the PRAGMA
check (instead of this.db.prepare(...)) and move the logging into the per-column
addition branches: in ensureGeneratedByModelColumn, replace the PRAGMA call to
use this.db.query('PRAGMA table_info(observations)').all() and, inside the if
(!hasGeneratedByModel) branch log that generated_by_model was added and inside
the if (!hasRelevanceCount) branch log that relevance_count was added (keep the
INSERT OR IGNORE into schema_versions as-is and retain the early return on
version 26).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/services/sqlite/migrations/runner.ts`:
- Around line 902-919: The ensureGeneratedByModelColumn method should use
this.db.query(...) for the PRAGMA check (instead of this.db.prepare(...)) and
move the logging into the per-column addition branches: in
ensureGeneratedByModelColumn, replace the PRAGMA call to use
this.db.query('PRAGMA table_info(observations)').all() and, inside the if
(!hasGeneratedByModel) branch log that generated_by_model was added and inside
the if (!hasRelevanceCount) branch log that relevance_count was added (keep the
INSERT OR IGNORE into schema_versions as-is and retain the early return on
version 26).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fc592d85-0271-400d-a291-d1d722073036

📥 Commits

Reviewing files that changed from the base of the PR and between 5e69688 and 9b55403.

📒 Files selected for processing (1)
  • src/services/sqlite/migrations/runner.ts

coderabbitai[bot]
coderabbitai Bot previously approved these changes Apr 7, 2026
@thedotmack
Copy link
Copy Markdown
Owner

Closed during the April 2026 backlog cleanup. This was verified already-fixed in v12.1.1 HEAD. Please update to the latest build. If you still see the issue on current version, open a fresh ticket with repro.

@thedotmack thedotmack closed this Apr 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants