Rename `CodeEvaluator` to `SystemEvaluator` to align with its focus on system-level metrics. A `CodeEvaluator` alias is kept in `evaluators.py` for backward compatibility.
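A backward-compatible alias of the kind this commit message describes can be a single assignment. The sketch below is an assumption about the general shape of such an alias, not the contents of `evaluators.py`; the class body and its `name` attribute are placeholders.

```python
class SystemEvaluator:
    """Placeholder standing in for the renamed evaluator class."""

    def __init__(self, name: str = "system") -> None:
        self.name = name


# Backward-compatible alias: code that still refers to CodeEvaluator
# resolves to the very same class object as SystemEvaluator.
CodeEvaluator = SystemEvaluator

legacy = CodeEvaluator(name="latency_check")
print(CodeEvaluator is SystemEvaluator)
print(isinstance(legacy, SystemEvaluator))
```

Because the alias binds the same class object, `isinstance` checks and subclassing behave identically under either name.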
-Combine multiple evaluators (`CodeEvaluator` + `LLMAsJudge` + custom functions) into a single aggregated verdict using configurable scoring strategies.
+Combine multiple evaluators (`SystemEvaluator` + `LLMAsJudge` + custom functions) into a single aggregated verdict using configurable scoring strategies.
@@ -248,7 +248,7 @@ Aggregations, filtering, joins, and even LLM evaluation (via `AI.GENERATE`) are
 LLM-based evaluation can run via (1) BigQuery `AI.GENERATE`, (2) legacy BigQuery ML `ML.GENERATE_TEXT`, or (3) the Gemini API directly. This maximizes compatibility across different GCP configurations.

 **Decision 4: Composition over inheritance.**
-The `GraderPipeline` composes `CodeEvaluator`, `LLMAsJudge`, and custom functions via a builder pattern rather than requiring them to share a common base class. The `BigQueryMemoryService` composes four internal services rather than extending a single monolithic class.
+The `GraderPipeline` composes `SystemEvaluator`, `LLMAsJudge`, and custom functions via a builder pattern rather than requiring them to share a common base class. The `BigQueryMemoryService` composes four internal services rather than extending a single monolithic class.

 ---
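The composition idea in Decision 4 can be illustrated with a small builder that accepts any callable scorer, so evaluators need no shared base class. This is a sketch under stated assumptions: `PipelineSketch`, `add`, `LatencyCheck`, `has_answer`, and the weighted-average strategy are hypothetical names chosen for illustration and are not the `GraderPipeline` API.

```python
from typing import Callable, Dict, List, Tuple

# Any callable that maps a session record to a score in [0, 1] qualifies.
Evaluator = Callable[[Dict], float]


class PipelineSketch:
    """Hypothetical builder that combines unrelated evaluators by weight."""

    def __init__(self, threshold: float = 0.5) -> None:
        self._steps: List[Tuple[Evaluator, float]] = []
        self._threshold = threshold

    def add(self, evaluator: Evaluator, weight: float = 1.0) -> "PipelineSketch":
        # Returning self lets calls be chained, builder style.
        self._steps.append((evaluator, weight))
        return self

    def evaluate(self, session: Dict) -> Dict:
        total_weight = sum(weight for _, weight in self._steps)
        if total_weight <= 0:
            raise ValueError("add at least one evaluator with positive weight")
        # Weighted average of the individual scores (one possible strategy).
        score = sum(ev(session) * weight for ev, weight in self._steps) / total_weight
        return {"score": score, "passed": score >= self._threshold}


class LatencyCheck:
    """A class-based evaluator; it shares no base class with the function below."""

    def __init__(self, max_ms: float) -> None:
        self.max_ms = max_ms

    def __call__(self, session: Dict) -> float:
        return 1.0 if session["latency_ms"] <= self.max_ms else 0.0


def has_answer(session: Dict) -> float:
    """A plain-function evaluator."""
    return 1.0 if session.get("answer") else 0.0


pipeline = (
    PipelineSketch(threshold=0.5)
    .add(LatencyCheck(max_ms=1000), weight=3.0)
    .add(has_answer, weight=1.0)
)
verdict = pipeline.evaluate({"latency_ms": 800, "answer": ""})
print(verdict)
```

The pipeline only requires that each step be callable, which is what lets a class instance and a bare function sit side by side.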
@@ -396,7 +396,7 @@ Each field generates a separate `AND` condition with a corresponding `bigquery.S

 This module contains two evaluator classes and the SQL templates that power batch evaluation.

-#### 4.3.1 `CodeEvaluator`
+#### 4.3.1 `SystemEvaluator`

 Deterministic evaluation using code-defined metric functions.
@@ -626,7 +626,7 @@ Combines heterogeneous evaluators into a unified verdict using a strategy patter
docs/implementation_plan_concept_index_runtime.md (1 addition, 1 deletion)
@@ -165,7 +165,7 @@ Work: `bigquery_ontology/contrib/advertising/` stub with Yahoo's resolver (if co
 - `src/bigquery_ontology/graph_ddl_compiler.py` — add `compile_concept_index(ontology, binding, *, output_table) -> str`. Preserve `compile_graph()` contract byte-identically. No changes to existing function bodies.
 - `src/bigquery_ontology/cli.py:299` — `compile` command gains `--emit-concept-index` and `--concept-index-table` flags. When absent, behavior is byte-identical to today.
 - `src/bigquery_ontology/__init__.py` — add `from .graph_ddl_compiler import compile_concept_index` so the new public function is importable as `from bigquery_ontology import compile_concept_index`, matching the existing pattern for `compile_graph` (`__init__.py:50` today).
-- `src/bigquery_agent_analytics/__init__.py` — add the new public surface to the try/except re-export block (same pattern as `Client`, `CodeEvaluator`, etc.):
+- `src/bigquery_agent_analytics/__init__.py` — add the new public surface to the try/except re-export block (same pattern as `Client`, `SystemEvaluator`, etc.):
   - `OntologyRuntime` from `.ontology_runtime`
   - `EntityResolver`, `ExactMatchResolver`, `SynonymResolver`, `Candidate`, `ResolveResult` from `.entity_resolver`
   - `ConceptIndexMismatchError`, `ConceptIndexProvenanceMissing`, `ConceptIndexInconsistentPair`, `ConceptIndexRefreshed` from `.ontology_runtime`
 | `Client.deep_analysis()` / question distribution | Partial | SQL does grouping / embeddings / top-k; UDF can help with categorization or normalization |
 | `Client.drift_detection()` | Partial | SQL computes set logic; UDF may help with text normalization or thresholding |
 | `Client.insights()` | Partial | Best split into SQL extraction + optional UDF post-processing; not a direct port |
@@ -224,7 +224,7 @@ That is maintainable. Reusing the entire client inside a Python UDF is not.

 The current evaluator score math is not implemented as standalone top-level
 functions today. It lives inside factory-method closures such as
-`CodeEvaluator.latency()` and `CodeEvaluator.error_rate()` in
+`SystemEvaluator.latency()` and `SystemEvaluator.error_rate()` in
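To make the distinction in this hunk concrete, the sketch below contrasts score math buried in a factory-method closure with the same math extracted into a standalone top-level function. Everything here is hypothetical: `EvaluatorSketch`, `latency_score`, and the linear formula are illustrative assumptions, not the project's actual `SystemEvaluator.latency()` implementation.

```python
from typing import Callable, Dict


def latency_score(latency_ms: float, threshold_ms: float) -> float:
    """Hypothetical standalone score: 1.0 at zero latency, 0.0 at or past the threshold."""
    if threshold_ms <= 0:
        raise ValueError("threshold_ms must be positive")
    return max(0.0, 1.0 - latency_ms / threshold_ms)


class EvaluatorSketch:
    """Hypothetical evaluator that wraps a single metric callable."""

    def __init__(self, metric: Callable[[Dict], float]) -> None:
        self.metric = metric

    @classmethod
    def latency(cls, threshold_ms: float) -> "EvaluatorSketch":
        # Factory-method closure: _score captures threshold_ms. If the math
        # were written inline here, it could only be reached through an
        # evaluator instance. Delegating to latency_score keeps it callable
        # on its own as well.
        def _score(session: Dict) -> float:
            return latency_score(session["latency_ms"], threshold_ms)

        return cls(_score)


evaluator = EvaluatorSketch.latency(threshold_ms=1000)
via_closure = evaluator.metric({"latency_ms": 250})
via_function = latency_score(250, 1000)
print(via_closure, via_function)
```

With the math in a top-level function, a caller that cannot construct the evaluator object can still compute the same score directly.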