Self-optimizing SQL query pipeline. Inspired by karpathy/autoresearch.
LLM rewrites query → run it → measure (speed + correctness) → keep if better → repeat
One file. Zero fluff.
Slow SQL queries are everywhere. Correlated subqueries, redundant scans, missing pre-aggregations — they kill app performance and nobody catches them until prod is on fire.
AutoSQL fixes this automatically:
- You give it a query
- It runs the query, records the baseline time and result hash
- An LLM rewrites the query trying to make it faster
- It runs the new query, checks the result is identical, measures the speedup
- Keeps the best version, feeds the history back into the next iteration
- Repeats until the budget is spent
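The loop above can be sketched in a few lines. This is a minimal illustration, not the project's actual code; `run_and_measure` and the `rewrite_with_llm` callback are hypothetical names standing in for the real implementation:

```python
import hashlib
import sqlite3
import time

def run_and_measure(conn, sql):
    """Execute a query; return (elapsed ms, hash of the sorted result rows)."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    elapsed_ms = (time.perf_counter() - start) * 1000
    digest = hashlib.sha256(repr(sorted(rows)).encode()).hexdigest()
    return elapsed_ms, digest

def optimize(conn, query, rewrite_with_llm, iterations=10):
    baseline_ms, baseline_hash = run_and_measure(conn, query)
    best_sql, best_ms, history = query, baseline_ms, []
    for _ in range(iterations):
        candidate = rewrite_with_llm(best_sql, history)  # LLM proposes a rewrite
        try:
            ms, digest = run_and_measure(conn, candidate)
        except sqlite3.Error:
            history.append((candidate, None, False))     # invalid SQL: record it, move on
            continue
        correct = digest == baseline_hash                # results must match exactly
        history.append((candidate, ms, correct))
        if correct and ms < best_ms:                     # keep only faster, correct rewrites
            best_sql, best_ms = candidate, ms
    return best_sql, baseline_ms / best_ms
```

Feeding `history` back into the rewrite callback is what lets the model learn from rejected attempts instead of repeating them.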
# 1. Install dependencies
pip install openai # covers Groq, OpenAI, Ollama, Together, OpenRouter
pip install anthropic # only needed for Claude models
# 2. Set the API key for your chosen provider
export GROQ_API_KEY=gsk_... # Groq (free at https://console.groq.com)
export OPENAI_API_KEY=sk_... # OpenAI
export ANTHROPIC_API_KEY=sk-ant-... # Anthropic
# Ollama needs no key — just have it running locally
# 3. Optimize your query
python AutoSQL.py --query slow.sql --db myapp.sqlite
# OpenAI
python AutoSQL.py --query slow.sql --db myapp.sqlite --model gpt-4o --provider openai
# Anthropic / Claude
python AutoSQL.py --query slow.sql --db myapp.sqlite --model claude-opus-4-5 --provider anthropic
# Ollama (local)
python AutoSQL.py --query slow.sql --db myapp.sqlite --model llama3 --provider ollama
# Any OpenAI-compatible endpoint
python AutoSQL.py --query slow.sql --db myapp.sqlite --model my-model --base-url http://localhost:8080/v1
# More iterations or a different Groq model
python AutoSQL.py --query "SELECT * FROM ..." --db myapp.sqlite --iterations 15
python AutoSQL.py --query slow.sql --db myapp.sqlite --model llama-3.1-8b-instant
──────────────────────────────────────────────────────────────
AutoSQL — Self-optimizing Query Pipeline
Model : llama-3.3-70b-versatile
Provider : groq
──────────────────────────────────────────────────────────────
Measuring baseline … 1842.3 ms
[01/10] Generating … ✓ 312.4 ms (5.89× faster) ← NEW BEST
[02/10] Generating … ✓ 298.1 ms (6.18× faster) ← NEW BEST
[03/10] Generating … ✗ wrong result
[04/10] Generating … ✓ 291.7 ms (6.32× faster) ← NEW BEST
...
──────────────────────────────────────────────────────────────
Final Report
──────────────────────────────────────────────────────────────
Baseline : 1842.3 ms
Best : 291.7 ms (6.32× faster)
AutoSQL tackles the classic anti-patterns that kill query performance:
| Anti-pattern | What AutoSQL does |
|---|---|
| Correlated subquery in WHERE | Replaces with pre-aggregated JOIN / CTE |
| Correlated subquery in SELECT | Replaces with pre-aggregated JOIN |
| Multiple passes over the same table | Collapses into a single scan |
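To make the first pattern concrete, here is a correlated subquery in WHERE next to its pre-aggregated CTE-plus-JOIN equivalent, checked for identical results on a toy schema (illustrative only, not from the project):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders(id INTEGER, customer_id INTEGER, amount REAL);
    INSERT INTO orders VALUES (1,1,10),(2,1,30),(3,2,20),(4,2,5);
""")

# Anti-pattern: the subquery re-runs for every row of the outer scan.
slow = """
    SELECT id, amount FROM orders o
    WHERE amount = (SELECT MAX(amount) FROM orders
                    WHERE customer_id = o.customer_id)
"""

# Rewrite: aggregate once in a CTE, then join, so the table is scanned once.
fast = """
    WITH m AS (SELECT customer_id, MAX(amount) AS amount
               FROM orders GROUP BY customer_id)
    SELECT o.id, o.amount FROM orders o JOIN m USING (customer_id, amount)
"""

assert sorted(conn.execute(slow)) == sorted(conn.execute(fast))
```

On real tables the difference between the two shapes grows with row count, which is exactly the kind of win the iteration log reports.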
AutoSQL never accepts a faster query that returns different results.
Every candidate is run and its output hashed. Only an exact match against the baseline hash counts as correct.
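A minimal version of that check, assuming rows are hashed exactly as fetched (so row order matters, which is why candidates should preserve any ORDER BY):

```python
import hashlib
import sqlite3

def result_hash(conn, sql):
    # Hash the full result set; any change in rows (or row order) changes the digest.
    h = hashlib.sha256()
    for row in conn.execute(sql):
        h.update(repr(row).encode())
    return h.hexdigest()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t(x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])

# Equivalent queries hash the same; a semantically different one does not.
assert result_hash(conn, "SELECT x FROM t ORDER BY x") == \
       result_hash(conn, "SELECT x FROM t WHERE x >= 0 ORDER BY x")
assert result_hash(conn, "SELECT x FROM t ORDER BY x") != \
       result_hash(conn, "SELECT x + 1 FROM t ORDER BY x")
```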
After the run, autosql_log.json contains every attempt:
{
"baseline_ms": 1842.3,
"best_ms": 291.7,
"speedup": 6.32,
"iterations": [
{ "iteration": 1, "time_ms": 312.4, "speedup": 5.89, "correct": true, "query": "..." },
...
]
}

| Flag | Default | Description |
|---|---|---|
| `--query` | (required) | SQL string or `.sql` file path |
| `--db` | (required) | SQLite database path |
| `--iterations` | 10 | How many rewrites to attempt |
| `--model` | `llama-3.3-70b-versatile` | LLM model to use |
| `--provider` | (auto-detect) | `groq`, `openai`, `anthropic`, `ollama`, `together`, `openrouter` |
| `--base-url` | (none) | Custom OpenAI-compatible endpoint URL |
| `--min-speedup` | 0 (disabled) | Stop early once this speedup multiplier is reached (e.g. 3.0) |
| `--timeout` | 0 (unlimited) | Per-execution timeout in milliseconds; aborts runaway queries |
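For SQLite, a per-execution timeout like `--timeout` can be built on a progress handler that aborts the statement once a deadline passes. A sketch under that assumption (not necessarily how AutoSQL implements it):

```python
import sqlite3
import time

def run_with_timeout(conn, sql, timeout_ms):
    """Abort the statement when the handler returns nonzero after the deadline."""
    deadline = time.perf_counter() + timeout_ms / 1000
    # The handler fires every 1000 SQLite VM instructions.
    conn.set_progress_handler(lambda: 1 if time.perf_counter() > deadline else 0, 1000)
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.set_progress_handler(None, 0)  # remove the handler afterwards

conn = sqlite3.connect(":memory:")
try:
    # An unbounded recursive CTE that would never finish without the timeout.
    run_with_timeout(
        conn,
        "WITH RECURSIVE c(x) AS (SELECT 1 UNION ALL SELECT x+1 FROM c) "
        "SELECT COUNT(*) FROM c",
        timeout_ms=50,
    )
except sqlite3.OperationalError:
    print("query aborted by timeout handler")
```

Aborting from the progress handler surfaces as a `sqlite3.OperationalError`, which the optimization loop can treat the same as any other failed candidate.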
AutoSQL intentionally stays minimal. Easy extension points:
- Other databases — swap `sqlite3` for `psycopg2`, `duckdb`, etc. Only `run_query()` changes.
- Other metrics — add memory, I/O reads, or query plan cost alongside wall-clock time.
- Indexes — let the LLM propose `CREATE INDEX` statements alongside the query rewrite.
- Batch mode — feed a folder of `.sql` files, get a report for each.
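The first extension point can be as small as routing `run_query()` through the DB-API connection of your choice. A sketch, where the `duckdb` branch assumes `pip install duckdb` (the `connect` helper is hypothetical):

```python
import sqlite3
import time

def connect(db_path, backend="sqlite"):
    # sqlite3 and duckdb both expose DB-API-style connect()/execute()/fetchall().
    if backend == "duckdb":
        import duckdb  # assumes: pip install duckdb
        return duckdb.connect(db_path)
    return sqlite3.connect(db_path)

def run_query(conn, sql):
    """The only function that touches the database: run, time, and fetch."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    return rows, (time.perf_counter() - start) * 1000

conn = connect(":memory:")
rows, ms = run_query(conn, "SELECT 1 + 1")
# rows == [(2,)]
```

Everything else in the pipeline (hashing, iteration, logging) only ever sees rows and milliseconds, so it stays backend-agnostic.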
- Python 3.10+
- `openai` (`pip install openai`) — covers Groq, OpenAI, Ollama, Together, OpenRouter
- `anthropic` (`pip install anthropic`) — only for Claude/Anthropic models
- At least one API key set as an environment variable (see Quickstart above)
- SQLite database
| Provider | `--provider` | Key env var | Notes |
|---|---|---|---|
| Groq | `groq` | `GROQ_API_KEY` | Default; free tier available |
| OpenAI | `openai` | `OPENAI_API_KEY` | GPT-4o, o3, etc. |
| Anthropic | `anthropic` | `ANTHROPIC_API_KEY` | Claude models; needs `pip install anthropic` |
| Ollama | `ollama` | (none) | Runs locally; no key required |
| Together AI | `together` | `TOGETHER_API_KEY` | |
| OpenRouter | `openrouter` | `OPENROUTER_API_KEY` | |
| Any OpenAI-compatible | `openai` | `OPENAI_API_KEY` | Pass `--base-url` to override endpoint |
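One way the `(auto-detect)` default for `--provider` could work is to check which key is present, in a fixed priority order. This is a sketch of that idea, not necessarily AutoSQL's actual detection logic:

```python
import os

# Priority order mirrors the provider table; Ollama needs no key,
# so it must be selected explicitly with --provider ollama.
PROVIDER_KEYS = [
    ("groq", "GROQ_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("anthropic", "ANTHROPIC_API_KEY"),
    ("together", "TOGETHER_API_KEY"),
    ("openrouter", "OPENROUTER_API_KEY"),
]

def detect_provider(env=None):
    env = os.environ if env is None else env
    for provider, key in PROVIDER_KEYS:
        if env.get(key):
            return provider
    raise SystemExit("No API key found; set one of the variables above "
                     "or pass --provider ollama")
```

With several keys set, the first match in the list wins, which is why Groq behaves as the default provider.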