Commit fa0f857

chore: update dataset prompt phrasing and add skills evaluation configs to run_config
1 parent e8237e8 commit fa0f857

2 files changed

Lines changed: 4 additions & 1 deletion

evals/dataset.json

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@
   {
     "id": "cloud-sql-schema-tables-explore",
     "starting_prompt": "I want to understand the structure of my database.",
-    "conversation_plan": "First, ask the agent to list the databases in the instance. After the agent provides the databases, ask it to list the tables specifically for the database.",
+    "conversation_plan": "First, ask the agent to list the databases in the instance. After the agent provides the databases, ask it to list the tables specifically for that database.",
     "expected_trajectory": [
       "list_databases",
       "list_tables"
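
For readers unfamiliar with the `expected_trajectory` field, the following is a minimal sketch of one way such a list could be compared against the tool calls an agent actually made. The commit does not show how the evaluation harness scores trajectories, so the function name `trajectory_matches` and the in-order subsequence rule (extra tool calls are tolerated, ordering is enforced) are assumptions for illustration only.

```python
from typing import Iterable, Sequence


def trajectory_matches(expected: Sequence[str], observed: Iterable[str]) -> bool:
    """Return True if every expected tool call appears in `observed`, in order.

    Hypothetical helper: the real scorer may use exact matching or another rule.
    """
    remaining = iter(observed)
    # `step in remaining` consumes the iterator up to the first match,
    # so later steps can only match calls that happened afterwards.
    return all(step in remaining for step in expected)


# Trajectory taken from the dataset entry in this diff.
expected = ["list_databases", "list_tables"]

print(trajectory_matches(expected, ["list_databases", "list_tables"]))  # True
print(trajectory_matches(expected, ["list_tables", "list_databases"]))  # False
```

Under this assumed rule, the reworded `conversation_plan` ("that database") does not change the expected trajectory; it only makes the simulated user's second request less ambiguous.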

evals/run_config.yaml

Lines changed: 3 additions & 0 deletions
@@ -25,12 +25,15 @@ scorers:
     model_config: /workspace/evals/gemini_2.5_pro_model.yaml
   behavioral_metrics:
     model_config: /workspace/evals/gemini_2.5_pro_model.yaml
+  skills_best_practices:
+    model_config: /workspace/evals/gemini_2.5_pro_model.yaml
 
   # Performance
   turn_count: {}
   end_to_end_latency: {}
   tool_call_latency: {}
   token_consumption: {}
+  skills_trajectory: {}
 
 reporting:
   bigquery:
