# Autoresearch on Ocean Network

Autonomous ML research agent that iteratively improves a GPT pretraining script to minimize validation bits-per-byte (val_bpb). Inspired by [Karpathy's autoresearch](https://github.com/karpathy/autoresearch).

The key difference: everything runs **inside a single Docker container** on an [Ocean](https://dashboard.oncompute.ai/) GPU node (H200, 141GB VRAM) with a **local open-source LLM** — no API keys needed.

## How It Works

1. **Data prep** — Downloads HuggingFace data shards, trains a BPE tokenizer (`prepare.py`)
2. **Load agent LLM** — Qwen3-32B-AWQ via vLLM (~18GB VRAM, stays resident)
3. **Baseline run** — Runs the original `train.py` (5-min training budget), records val_bpb
4. **Agent loop** (up to 200 iterations; see the sketch below):
   - LLM reads experiment history + current best `train.py`
   - Generates a hypothesis + complete new `train.py`
   - Syntax check → train (5 min) → evaluate val_bpb
   - If improved: keep. If not: revert to best.
   - `results.json` saved after every iteration

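A minimal sketch of this loop in Python. The helper names `propose_train_py` (the vLLM call) and `run_training_and_eval` (one 5-minute training run), and the exact `results.json` layout, are illustrative placeholders rather than the actual `algo.py` API:

```python
import ast
import json
from pathlib import Path

def propose_train_py(history, best_code):
    """Placeholder: ask the local LLM (Qwen3-32B-AWQ via vLLM) for a hypothesis
    and a complete new train.py; the real prompt lives in program.md."""
    raise NotImplementedError

def run_training_and_eval(path):
    """Placeholder: run one 5-minute training + evaluation, return val_bpb."""
    raise NotImplementedError

def agent_loop(max_iters=200):
    train_py = Path("train.py")
    best = {"train_py": train_py.read_text(),
            "val_bpb": run_training_and_eval(train_py)}   # baseline run
    history = []

    for i in range(max_iters):
        # LLM reads the experiment history + current best script,
        # then proposes a hypothesis and a complete replacement train.py.
        hypothesis, new_code = propose_train_py(history, best["train_py"])
        try:
            ast.parse(new_code)                        # cheap syntax check first
            train_py.write_text(new_code)
            val_bpb = run_training_and_eval(train_py)  # 5-minute budget
        except Exception:
            val_bpb = None                             # syntax error or crash

        if val_bpb is not None and val_bpb < best["val_bpb"]:
            best = {"train_py": new_code, "val_bpb": val_bpb}    # keep improvement
        else:
            train_py.write_text(best["train_py"])                # revert to best

        history.append({"iter": i, "hypothesis": hypothesis, "val_bpb": val_bpb})
        # Persist after every iteration so nothing is lost if the job dies
        Path("results.json").write_text(json.dumps({"best": best, "history": history}))
```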
When the run finishes, extract `results["best"]["train_py"]` from `results.json` to get the winning code.
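
For example (only the `results["best"]["train_py"]` key comes from the description above; the output filename `train_best.py` is arbitrary):

```python
import json

with open("results.json") as f:
    results = json.load(f)

# Write the agent's best-performing training script to a standalone file
with open("train_best.py", "w") as f:
    f.write(results["best"]["train_py"])
```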

## Files

| File | Description |
|------|-------------|
| `algo.py` | Core agent loop — orchestrates LLM inference and training |
| `train.py` | GPT pretraining script (the file the agent modifies) |
| `prepare.py` | Data download, tokenizer, dataloader, evaluation (read-only) |
| `program.md` | Instructions for the agent LLM |
| `Dockerfile` | Container build (CUDA 12.8, Python, PyTorch, vLLM) |
| `plot_progress.py` | Generate progress charts from results |

## Usage

1. Go to [dashboard.oncompute.ai](https://dashboard.oncompute.ai/)
2. Select an **H200 GPU** environment
3. Configure the job and add payment
4. Open the **Ocean Orchestrator** in VS Code / your editor
5. Open this directory in the orchestrator and run the job — the container builds and executes `algo.py` autonomously
6. Download `results.json` from the outputs when complete

To plot results after a run:
```bash
python plot_progress.py path/to/results.json progress.png
```

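If you want a custom chart instead of the bundled script, here is a rough matplotlib sketch. The `results.json` layout it assumes (a `history` list of per-iteration records with `iter` and `val_bpb` fields) is a guess; check the file produced by your own run:

```python
import json
import matplotlib.pyplot as plt

results = json.load(open("results.json"))

# Assumed schema: results["history"] is a list of per-iteration records;
# skip crashed iterations, which have no val_bpb.
runs = [r for r in results["history"] if r.get("val_bpb") is not None]

plt.plot([r["iter"] for r in runs], [r["val_bpb"] for r in runs], marker="o")
plt.xlabel("iteration")
plt.ylabel("val_bpb")
plt.title("Autoresearch progress")
plt.savefig("progress.png", dpi=150)
```
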
## Results

### Qwen3-32B-AWQ — First Run



- **Baseline**: 1.0077 val_bpb
- **Best**: 0.9818 val_bpb (2.6% improvement)
- **201 iterations** over 5.5 hours, 30 successful runs (85% crash rate)
- Key improvements: increased model depth (8→10 layers), late-stage hyperparameter tuning