Feat: bq-agent-sdk ontology-build --skip-property-graph to populate base tables only
Goal
Add an opt-in flag on the ontology-build CLI that runs phases 1–4 (load spec → extract → create tables → materialize) and stops before phase 5 (CREATE OR REPLACE PROPERTY GRAPH). Lets users with a pre-defined BigQuery property graph populate its base tables from BQ AA traces without overwriting their graph DDL on every run.
Motivation
build_ontology_graph(...) (ontology_orchestrator.py:291) runs five phases in sequence:
1. Load spec.
2. Extract ExtractedGraph via AI.GENERATE on agent_events.
3. OntologyMaterializer.create_tables(): CREATE TABLE IF NOT EXISTS per entity/relationship table. Idempotent against pre-existing tables.
4. OntologyMaterializer.materialize(...): staging table → DELETE-by-session → INSERT FROM staging. Non-destructive, schema-aware via bq_client.get_table(...).schema.
5. OntologyPropertyGraphCompiler.compile_property_graph_ddl(...), then execute → CREATE OR REPLACE PROPERTY GRAPH (ontology_property_graph.py:307).
Phase 5 is destructive against a user-defined property graph. A user with their own CREATE PROPERTY GRAPH DDL — managed by Terraform, dbt, or hand-authored to express graph-object features the SDK doesn't generate yet — gets it overwritten on every ontology-build run.
The clean workaround today is dropping into Python and calling OntologyGraphManager.extract_graph(...) + OntologyMaterializer.materialize(...) directly, bypassing the orchestrator. That works but loses the CLI surface (--ontology, --binding, --session-ids, format output, error handling).
Proposed change
One new flag on the ontology-build command in cli.py:1200:
    bq-agent-sdk ontology-build \
      --ontology my.ontology.yaml \
      --binding my-bq-prod.binding.yaml \
      --session-ids s1,s2,s3 \
      --project-id my-project \
      --dataset-id my-dataset \
      --skip-property-graph
When --skip-property-graph is set:
- Phase 5 is not invoked.
- The result dict reports property_graph_created: False with skipped_reason: "user_requested" (distinct from today's False, which means attempted-and-failed).
- The CLI's printed/JSON output also exposes a disambiguating field. Today's CLI output (the curated dict around cli.py:1266–1274) only includes property_graph_created. Add a property_graph_status field with one of "created", "failed", or "skipped:user_requested". Without this, JSON consumers see property_graph_created: false with exit 0 and no signal explaining why. That is locally consistent with the result dict but user-visibly ambiguous.
- The CLI exit-1 branch at cli.py:1277–1284 (which today raises typer.Exit(code=1) whenever property_graph_created is False) must be updated to short-circuit when result.get("skipped_reason") == "user_requested". In that case, exit 0 with no error message. The "Property Graph creation failed" message stays for the attempted-and-failed branch.
The default stays False to preserve current behavior.
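The status and exit-code rules above can be sketched as two small pure functions. This is a minimal sketch under stated assumptions, not the SDK's code: the helper names property_graph_status and exit_code are hypothetical, and the result-dict keys are the ones proposed in this issue.

```python
from typing import Any, Mapping


def property_graph_status(result: Mapping[str, Any]) -> str:
    """Map the orchestrator result dict to the proposed status string.

    Hypothetical helper: the key names follow this proposal.
    """
    if result.get("property_graph_created"):
        return "created"
    if result.get("skipped_reason") == "user_requested":
        # Phase 5 was never attempted, so this is not a failure.
        return "skipped:user_requested"
    # property_graph_created is False with no skip reason: attempted and failed.
    return "failed"


def exit_code(result: Mapping[str, Any]) -> int:
    """Return 1 only when phase 5 was attempted and failed, else 0."""
    return 1 if property_graph_status(result) == "failed" else 0
```

Deriving the exit code from the status string keeps the two signals from drifting apart: a JSON consumer reading property_graph_status and a shell script checking the exit code always agree on whether the run failed.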
Implementation sketch
build_ontology_graph gains a skip_property_graph: bool = False parameter. The phase-5 block becomes:
    if skip_property_graph:
        property_graph_created = False
        skipped_reason = "user_requested"
    else:
        # existing CREATE OR REPLACE PROPERTY GRAPH path
        ...
The CLI threads the flag through. Output dict gains skipped_reason only when populated.
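To make the branch concrete, here is a self-contained sketch that stands in for the orchestrator. Everything in it is an assumption for illustration: build_ontology_graph_sketch, run_phases_1_to_4, and create_property_graph are hypothetical names, and the real build_ontology_graph signature is not reproduced here. The phases are passed in as callables so the sketch runs without BigQuery.

```python
from typing import Any, Callable, Dict, List


def build_ontology_graph_sketch(
    session_ids: List[str],
    run_phases_1_to_4: Callable[[List[str]], Dict[str, Any]],
    create_property_graph: Callable[[], bool],
    skip_property_graph: bool = False,
) -> Dict[str, Any]:
    """Hypothetical stand-in for the orchestrator's phase-5 branch."""
    # Phases 1-4 (load spec, extract, create tables, materialize) always run.
    result = dict(run_phases_1_to_4(session_ids))

    if skip_property_graph:
        # Phase 5 is never invoked, so the user's graph DDL is left untouched.
        result["property_graph_created"] = False
        result["skipped_reason"] = "user_requested"
    else:
        # Existing path: compile and execute CREATE OR REPLACE PROPERTY GRAPH.
        # skipped_reason is deliberately absent here, even on failure.
        result["property_graph_created"] = create_property_graph()
    return result
```

Because the phase-5 callable is only reached in the else branch, a test can pass a fake that records calls and assert it was never invoked when the flag is set. That mirrors the criterion that the compiler is not constructed on the skip path.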
Acceptance criteria
- bq-agent-sdk ontology-build --skip-property-graph ... populates entity/relationship tables and exits 0 without invoking CREATE OR REPLACE PROPERTY GRAPH.
- The CLI exit-1 branch at cli.py:1277 is updated: when property_graph_created is False and skipped_reason == "user_requested", the CLI exits 0 with no error printed. When property_graph_created is False for any other reason, the existing exit-1-with-error-message behavior is preserved.
- The CLI output (the curated dict around cli.py:1266–1274) gains a property_graph_status field with values "created", "failed", or "skipped:user_requested"; both text and json formats expose it. Test asserts JSON consumers can distinguish skipped from failed without reading stderr.
- OntologyPropertyGraphCompiler is not constructed when the flag is set.
- A live integration test (gated by RUN_LIVE_BIGQUERY_TESTS=1, matching the existing ontology integration test pattern at tests/test_integration_ontology_binding.py:44) creates a pre-existing property graph, runs ontology-build --skip-property-graph against pre-existing base tables, and verifies the user's graph definition is unchanged after the run.
- Documentation in docs/ontology/ (or wherever the orchestrator is documented) explains the flag and the use case.
Out of scope
- Validating that the user's pre-existing property graph is consistent with the ontology+binding the SDK is materializing into. That is a separate pre-flight concern; see the companion issue for binding-vs-physical-schema validation.
- Skipping phase 3 (create_tables). It's already a no-op against pre-existing tables via CREATE TABLE IF NOT EXISTS (ontology_materializer.py:207, 210, 221, 241). No flag needed.
- Auto-deriving an ontology+binding from a user's existing property graph DDL.
Related
Effort
~0.5 eng-day. CLI threading + one orchestrator branch + tests.