feat(graphile-llm): wire billing metering into LLM plugins by pyramation · Pull Request #1192 · constructive-io/constructive

pyramation · 2026-05-18T20:44:07Z

Summary

Wire billing metering infrastructure into the graphile-llm package with clean architectural separation between pure LLM plugins and opt-in billing.

Key design: meter slug = model name

Each LLM model gets its own billing meter. The meter slug IS the model name — no mapping, no guessing:

INSERT INTO meters (slug, display_name, meter_type, aggregation, credit_cost, category_meter)
VALUES
  ('inference',              'Inference',              'usage_pool', 'cumulative', 1, NULL),
  ('text-embedding-3-small', 'text-embedding-3-small', 'quota', 'cumulative', 1, 'inference'),
  ('gpt-4o-mini',           'gpt-4o-mini',            'quota', 'cumulative', 3, 'inference');

Three-level waterfall (handled by billing module's category_meter):

per-model meter → inference usage pool → universal credits
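
The waterfall itself lives inside the billing module, so the following is only an illustrative sketch of how category_meter links could be walked to produce the chain above. The Meter type, the resolveWaterfall function, and the 'universal_credits' label are hypothetical names introduced for this example; they are not part of the PR.

```typescript
// Hypothetical in-memory view of the meter rows seeded above.
interface Meter {
  slug: string;
  meterType: 'usage_pool' | 'quota';
  creditCost: number;
  categoryMeter: string | null;
}

const meters: Meter[] = [
  { slug: 'inference', meterType: 'usage_pool', creditCost: 1, categoryMeter: null },
  { slug: 'text-embedding-3-small', meterType: 'quota', creditCost: 1, categoryMeter: 'inference' },
  { slug: 'gpt-4o-mini', meterType: 'quota', creditCost: 3, categoryMeter: 'inference' },
];

// Walk category_meter links upward from the per-model meter, then append the
// universal-credits fallback. The 'universal_credits' label is an assumption.
function resolveWaterfall(modelName: string, all: Meter[]): string[] {
  const bySlug = new Map(all.map((m) => [m.slug, m] as const));
  const chain: string[] = [];
  const seen = new Set<string>(); // guards against accidental cycles
  let current = bySlug.get(modelName);
  while (current && !seen.has(current.slug)) {
    seen.add(current.slug);
    chain.push(current.slug);
    current = current.categoryMeter ? bySlug.get(current.categoryMeter) : undefined;
  }
  chain.push('universal_credits');
  return chain;
}
```

Under these assumptions, resolveWaterfall('gpt-4o-mini', meters) yields the per-model meter, then 'inference', then the universal fallback.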

Architecture

Pure plugins (no billing dependency):

LlmModulePlugin — resolves embedder + chat completer from config/env, exposes model names on build
LlmTextSearchPlugin — adds text: String to VectorNearbyInput, embeds text server-side
LlmTextMutationPlugin — adds {column}Text: String companion fields on mutation inputs

Opt-in metering (separate plugin):

LlmMeteringPlugin — transparently wraps build.llmEmbedder with billing quota checks + usage recording via AsyncLocalStorage
Reads model names from build (llmEmbeddingModel, llmChatModel) and uses them as default meter slugs
Entity ID resolved via configurable resolveEntityId callback (default: jwt.claims.user_id)
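
As a rough illustration of the AsyncLocalStorage approach described above, the sketch below wraps an embedder so that calls made inside a per-request context are recorded against that request only. The Embedder and MeteringContext shapes, wrapEmbedder, and runWithMetering are assumptions made for this example; the actual code in metering-plugin.ts is not shown in this description and may differ.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Hypothetical shapes; the real plugin's types are not shown in this PR description.
type Embedder = (text: string) => Promise<number[]>;
interface MeteringContext {
  entityId: string;
  usage: { meterSlug: string; textLength: number }[];
}

const meteringStore = new AsyncLocalStorage<MeteringContext>();

// Wrap an embedder so that calls made inside a metering context are recorded,
// while calls made outside any context pass through untouched.
function wrapEmbedder(inner: Embedder, meterSlug: string): Embedder {
  return async (text) => {
    const ctx = meteringStore.getStore();
    const vector = await inner(text);
    if (ctx) {
      ctx.usage.push({ meterSlug, textLength: text.length });
    }
    return vector;
  };
}

// Run one request inside its own context. Concurrent requests each get a
// separate store, so their usage records do not mix.
function runWithMetering<T>(
  entityId: string,
  fn: () => Promise<T>,
): Promise<{ result: T; ctx: MeteringContext }> {
  const ctx: MeteringContext = { entityId, usage: [] };
  return meteringStore.run(ctx, async () => ({ result: await fn(), ctx }));
}
```

Downstream callers only ever see an Embedder, which is what lets the pure plugins stay unaware of billing.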

Supporting utilities:

config-cache.ts — LRU cache (5-min TTL, max 50) for billing_module metadata per database_id. Schema-existence guard checks information_schema.schemata before querying.
metering.ts — meteredEmbed() / meteredChat() wrappers
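
For readers unfamiliar with the caching pattern, here is a minimal sketch of an LRU cache with a TTL, using the 5-minute / 50-entry limits mentioned above as defaults. The TtlLruCache class is hypothetical and is not the implementation in config-cache.ts; the schema-existence guard is not modeled here.

```typescript
// Minimal TTL + LRU cache. A Map preserves insertion order, so its first key
// is always the least recently used entry.
class TtlLruCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();

  constructor(
    private maxEntries = 50,
    private ttlMs = 5 * 60 * 1000,
    private now: () => number = () => Date.now(), // injectable clock for tests
  ) {}

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    this.entries.delete(key);
    if (hit.expiresAt <= this.now()) return undefined; // expired: stay deleted
    this.entries.set(key, hit); // re-insert to mark as most recently used
    return hit.value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    if (this.entries.size >= this.maxEntries) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

Keying such a cache by database_id would keep billing_module metadata lookups off the hot path for up to one TTL window.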

Usage

// Without metering (pure LLM only)
GraphileLlmPreset({
  defaultEmbedder: { provider: 'openai', model: 'text-embedding-3-small' },
})

// With metering (meter slug = 'text-embedding-3-small')
GraphileLlmPreset({
  defaultEmbedder: { provider: 'openai', model: 'text-embedding-3-small' },
  metering: true,
})

// Bill per-database (platform tier)
GraphileLlmPreset({
  defaultEmbedder: { provider: 'openai', model: 'text-embedding-3-small' },
  metering: { resolveEntityId: (pg) => pg['jwt.claims.database_id'] },
})

Two-tier billing model

Documented in docs/spec/llm-metering.md:

Platform tier: entity_id = database_id, Constructive bills the DB owner
Tenant tier: entity_id = actor_id, DB owner tracks per-user usage (optional, only if billing_module provisioned)
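
The two tiers differ only in which identifier is billed, so each can be expressed as a resolveEntityId callback. The sketch below assumes the callback receives a flat record of settings, as in the Usage examples, and that the tenant tier reads the default jwt.claims.user_id claim; the PgSettings and ResolveEntityId type names are hypothetical.

```typescript
// Assumed shape: a flat record of Postgres settings, as in the Usage examples.
type PgSettings = Record<string, string | undefined>;
type ResolveEntityId = (pg: PgSettings) => string | null;

// Platform tier: bill the owner of the database.
const platformTier: ResolveEntityId = (pg) => pg['jwt.claims.database_id'] ?? null;

// Tenant tier: track usage per user (assumes the default claim name).
const tenantTier: ResolveEntityId = (pg) => pg['jwt.claims.user_id'] ?? null;
```

Returning null here stands in for the "missing entity_id" case, which the PR treats as unmetered passthrough.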

Review & Testing Checklist for Human

Verify the AsyncLocalStorage approach in metering-plugin.ts handles concurrent requests correctly
Confirm information_schema.schemata guard in config-cache.ts works for databases without metaschema_modules_public
Review resolveEntityId default (jwt.claims.user_id) — for platform-tier billing, you'd want jwt.claims.database_id instead
Test with metering disabled (default) — should work identically to before
Review docs/spec/llm-metering.md for accuracy of the two-tier billing model and waterfall design
Verify per-model meters are seeded in the billing_module with category_meter = 'inference'

Recommended test plan: Set up a database with billing_module, seed model-name meters with category_meter = 'inference', configure metering: true, and run embedding queries. Verify record_usage entries appear in the ledger under the model name slug.

Notes

The spec at docs/spec/llm-metering.md is the canonical reference for the metering architecture. It covers the two-tier model, meter slug convention, waterfall, and graceful degradation.
RAG plugin metering is not yet wired (Phase 2).
Two-tier double-write (platform + tenant) is designed but not yet implemented — current code writes to one tier only.

Link to Devin session: https://app.devin.ai/sessions/2b5a29d83d3f478e8d3d972653b4879c
Requested by: @pyramation

Add per-database billing integration to the graphile-llm package:

- config-cache.ts: LRU cache (5-min TTL, 50 entries) for billing_module metadata and API key resolution from app_secrets per database_id
- metering.ts: billing-aware wrappers (meteredEmbed, meteredChat) that call check_billing_quota() before and record_usage() after LLM calls
- LlmModulePlugin: exposes metering options on the build context
- LlmTextSearchPlugin: metered embedding with graceful degradation; when quota is exceeded, skips the vector path (text-only search continues)
- LlmTextMutationPlugin: metered embedding that throws QuotaExceededError on mutations (can't silently skip writing a vector the user asked for)
- MeteringConfig on GraphileLlmOptions: configurable meter slugs, estimated tokens, and skip toggle. Auto-detects billing_module.
- Uses Graphile withPgClient pattern for all billing SQL calls

Billing functions (check_billing_quota, record_usage) are resolved from the tenant database's billing_module metaschema. When billing is not provisioned, all calls pass through unmetered.
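
The check-before / record-after flow and the two quota-exceeded behaviours from this commit can be sketched roughly as follows. The BillingClient interface, the parameter list, and the 'skip' / 'throw' switch are assumptions made for illustration; in the PR the billing calls are SQL functions (check_billing_quota, record_usage) and the real meteredEmbed signature is not shown here.

```typescript
// Hypothetical billing client standing in for the SQL functions.
interface BillingClient {
  checkQuota(entityId: string, meterSlug: string, tokens: number): Promise<boolean>;
  recordUsage(entityId: string, meterSlug: string, tokens: number): Promise<void>;
}

class QuotaExceededError extends Error {
  constructor(meterSlug: string) {
    super(`Quota exceeded for meter "${meterSlug}"`);
    this.name = 'QuotaExceededError';
  }
}

type Embedder = (text: string) => Promise<number[]>;

// onExceeded = 'skip' models the search path (null means "no vector, continue
// with text-only search"); onExceeded = 'throw' models the mutation path.
async function meteredEmbed(
  embed: Embedder,
  billing: BillingClient | null,
  entityId: string | null,
  meterSlug: string,
  text: string,
  estimatedTokens: number,
  onExceeded: 'skip' | 'throw',
): Promise<number[] | null> {
  // Billing not provisioned or no entity to bill: unmetered passthrough.
  if (!billing || !entityId) return embed(text);

  const allowed = await billing.checkQuota(entityId, meterSlug, estimatedTokens);
  if (!allowed) {
    if (onExceeded === 'throw') throw new QuotaExceededError(meterSlug);
    return null;
  }

  const vector = await embed(text);
  await billing.recordUsage(entityId, meterSlug, estimatedTokens);
  return vector;
}
```

The asymmetry is deliberate in the PR's design: a search can degrade quietly, whereas a mutation that silently dropped the requested vector would leave inconsistent data.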

devin-ai-integration · 2026-05-18T20:44:10Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

Disable automatic comment and CI monitoring

…Plugin

- Extract all billing/metering logic into metering-plugin.ts
- LlmModulePlugin, TextSearchPlugin, TextMutationPlugin are now pure (no billing imports, no metering context building)
- LlmMeteringPlugin uses AsyncLocalStorage to transparently wrap the embedder with quota checks; downstream plugins are unaware of billing
- Entity ID resolved via configurable callback (default: jwt.claims.user_id) instead of non-existent jwt.claims.membership_id
- Metering is opt-in: only loaded when metering option is truthy
- Add schema-existence guard in config-cache (checks metaschema_modules_public exists before querying billing_module table)
- Graceful degradation: missing schema, missing entity_id, or failed billing calls all result in unmetered passthrough

- LlmModulePlugin now exposes llmEmbeddingModel and llmChatModel on build
- LlmMeteringPlugin reads model names from build and uses them as default meter slugs (e.g. 'text-embedding-3-small' → billing meters table)
- Three-level waterfall: per-model → inference pool → universal credits (handled by billing module's category_meter field)
- Remove hardcoded 'embedding_tokens'/'chat_tokens' defaults
- Add docs/spec/llm-metering.md: full architecture reference for two-tier billing, model=meter slug convention, and waterfall

…user_module

The table was renamed from metaschema_modules_public.encrypted_secrets_module to metaschema_modules_public.config_secrets_user_module. Also updated the JOIN column from private_schema_id to schema_id to match the new schema.

…t length

Removed the configurable estimatedEmbeddingTokens option: token counts are now estimated directly from the input text length (~4 chars/token). No tokenizer is needed, since the billing system uses tokens as abstract units and the credit_cost per model normalizes relative expense.
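
A minimal sketch of the length-based estimate described in this commit follows. Rounding up is an assumption, and estimateCredits (tokens multiplied by the meter's credit_cost) is a hypothetical helper showing one way credit_cost could normalize cost across models; the billing module's actual arithmetic is not shown in this PR.

```typescript
// Roughly 4 characters per token, as stated in the commit message.
// Rounding up is an assumption of this sketch.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Hypothetical helper: credits = estimated tokens × the meter's credit_cost.
function estimateCredits(text: string, creditCost: number): number {
  return estimateTokens(text) * creditCost;
}
```

With the seed data from the description, the same input would cost three times as many credits on 'gpt-4o-mini' (credit_cost 3) as on 'text-embedding-3-small' (credit_cost 1) under this assumption.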

devin-ai-integration Bot assigned pyramation May 18, 2026

pyramation added 5 commits May 18, 2026 21:01

remove docs/spec/llm-metering.md (moving to constructive-db)

52b603a

pyramation wants to merge 6 commits into main from feat/llm-billing-metering
