feat(graphile-llm): add inference usage logging to metering plugin #1196
Merged
Adds inline INSERT into usage_log_inference table after billing record_usage calls in both meteredEmbed and meteredChat functions.

Changes:
- config-cache.ts: Add InferenceLogConfig type and resolution from inference_log_module metaschema table (cached alongside billing config)
- metering.ts: Add InferenceLogEntry type, logInferenceUsage helper, and calls after billing in meteredEmbed/meteredChat (including quota_exceeded events). Add databaseId, actorId, inferenceLog to MeteringContext. Add embeddingModel, chatModel, provider to MeteringOptions.
- metering-plugin.ts: Wire databaseId, actorId, inferenceLog into MeteringContext. Pass model names to MeteringOptions.
- index.ts: Export new types (InferenceLogEntry, InferenceLogConfig) and logInferenceUsage function.

Gracefully skips if inference_log_module is not provisioned.

TODO: dual-write to child (generated) database for platform aggregation.
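The graceful-skip behavior described in the commit message could look roughly like the sketch below. This is not the code from the PR: the `QueryClient` interface, the fields of `InferenceLogConfig` and `InferenceLogEntry`, and the column names `database_id`, `actor_id`, `input_tokens`, and `output_tokens` are all assumptions made for illustration. Only `model`, `provider`, `request_type`, `latency_ms`, and `status` are named in the PR itself.

```typescript
// Hypothetical shapes: the real InferenceLogConfig and InferenceLogEntry
// in this PR may carry different fields.
interface InferenceLogConfig {
  schemaName: string;
  tableName: string; // e.g. "usage_log_inference"
}

interface InferenceLogEntry {
  databaseId: string;
  actorId: string | null;
  model: string;
  provider: string;
  requestType: "embed" | "chat";
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
  status: "success" | "quota_exceeded";
}

// Minimal stand-in for a pg-style client (an assumption, not the plugin's type).
interface QueryClient {
  query(text: string, values: unknown[]): Promise<unknown>;
}

// Double-quote an identifier so schema/table names are safe to interpolate.
function quoteIdent(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"';
}

// Returns true if a row was written, false if logging was skipped.
async function logInferenceUsage(
  client: QueryClient,
  config: InferenceLogConfig | null,
  entry: InferenceLogEntry
): Promise<boolean> {
  // inference_log_module not provisioned: skip without failing the LLM call.
  if (!config) return false;

  const table = `${quoteIdent(config.schemaName)}.${quoteIdent(config.tableName)}`;
  await client.query(
    `INSERT INTO ${table}
       (database_id, actor_id, model, provider, request_type,
        input_tokens, output_tokens, latency_ms, status)
     VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9)`,
    [
      entry.databaseId, entry.actorId, entry.model, entry.provider,
      entry.requestType, entry.inputTokens, entry.outputTokens,
      entry.latencyMs, entry.status,
    ]
  );
  return true;
}
```

Passing a `null` config models the "module not provisioned" case, so callers in `meteredEmbed` and `meteredChat` would not need their own guard.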
Summary
Adds inference usage logging to the LLM metering plugin. Every metered LLM call (embed or chat) now INSERTs into the `usage_log_inference` table with full usage metadata.

Changes:
- `metering.ts`: adds `logInferenceUsage()` helper that INSERTs into the inference log table; adds `databaseId`, `actorId`, `inferenceLog` to `MeteringContext`; both `meteredEmbed` and `meteredChat` now log usage on success and quota_exceeded
- `metering-plugin.ts`: extracts `databaseId`, `actorId` from pgSettings; queries `inferenceLog` config from `getLlmBillingConfig()`
- `config-cache.ts`: adds `inferenceLog` to the `ModuleConfigCache`; queries `inference_log_module` config alongside billing config
- `placeholderAmountTokens` (~4 chars/token estimation): clearly marked as placeholders pending team approval of `generateWithUsage()` on agentic-kit

Depends on: #1192 (billing metering + cache standardization, now merged)
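As a rough illustration of the flow summarized above, the sketch below logs one entry per metered chat call, on both the success and the quota_exceeded path, using the ~4 chars/token placeholder estimate. The `Deps` interface, the `meteredChatSketch` name, the `UsageLog` shape, and returning `null` on quota exhaustion are assumptions for this example and may not match how the plugin actually signals or orders these steps.

```typescript
type Status = "success" | "quota_exceeded";

// Hypothetical log shape for this sketch only.
interface UsageLog {
  requestType: "chat";
  model: string;
  placeholderAmountTokens: number;
  latencyMs: number;
  status: Status;
}

// Placeholder estimate (~4 characters per token), as described in the PR.
// Replace with actual provider token counts once generateWithUsage() is approved.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Hypothetical dependencies, injected so the flow can be exercised in isolation.
interface Deps {
  hasQuota(tokens: number): Promise<boolean>; // stand-in for the billing check
  generate(prompt: string): Promise<string>;  // stand-in for the provider call
  log(entry: UsageLog): Promise<void>;        // stand-in for logInferenceUsage
}

async function meteredChatSketch(
  deps: Deps,
  model: string,
  prompt: string
): Promise<string | null> {
  const startedAt = Date.now();
  const promptTokens = estimateTokens(prompt);

  // Quota exhausted: still write a log row so denied calls are visible.
  if (!(await deps.hasQuota(promptTokens))) {
    await deps.log({
      requestType: "chat",
      model,
      placeholderAmountTokens: promptTokens,
      latencyMs: Date.now() - startedAt,
      status: "quota_exceeded",
    });
    return null; // simplification; real code may throw instead
  }

  const reply = await deps.generate(prompt);

  // Success: log prompt plus completion tokens, both estimated.
  await deps.log({
    requestType: "chat",
    model,
    placeholderAmountTokens: promptTokens + estimateTokens(reply),
    latencyMs: Date.now() - startedAt,
    status: "success",
  });
  return reply;
}
```

Keeping the estimate behind a single `estimateTokens` function means the later switch to real provider counts would touch one place.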
Review & Testing Checklist for Human
- `usage_log_inference` rows appear after an LLM embed/chat call with correct `model`, `provider`, `request_type`, token counts, `latency_ms`, and `status`
- quota_exceeded events are logged (not just successes)
- `placeholderAmountTokens` variables are clearly placeholder: search for the comment "replace with actual provider token counts once generateWithUsage() is approved"

Notes
Token estimation is intentionally a placeholder (`Math.ceil(text.length / 4)`). The swap-out point is clearly marked in the code. Once `generateWithUsage()` is approved on agentic-kit, replace `placeholderAmountTokens` with actual provider counts.

Link to Devin session: https://app.devin.ai/sessions/2b5a29d83d3f478e8d3d972653b4879c
Requested by: @pyramation