Commit b8fa30f
Rewrite snapshot v2: JSON-only, no cloudpickle, Rust tokenizer
Eliminates the cloudpickle dependency entirely. New approach:
- Tokenizer: save_pretrained → load via Rust tokenizers lib (0.2s)
- Model config: JSON manifest (not pickled class objects)
- Tensors: mmap'd from safetensors files (0.07s)
- Model: reconstruct from config on meta device + load_state_dict
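A minimal, stdlib-only sketch of the manifest idea from the list above: the model class is recorded as a dotted import path in JSON and resolved with importlib at hydrate time, so no class object is ever pickled. The file name manifest.json, the keys version / model_class / config, and the helper names are assumptions for illustration, not taken from this commit. types.SimpleNamespace stands in for a real model class; tensor loading and the meta-device step are not shown.

```python
import importlib
import json
from pathlib import Path


def write_manifest(snapshot_dir: Path, model_class: str, config: dict) -> Path:
    """Write a JSON manifest; the class is stored by dotted name, not pickled."""
    # Hypothetical layout: key names are illustrative only.
    manifest = {"version": 2, "model_class": model_class, "config": config}
    path = snapshot_dir / "manifest.json"
    path.write_text(json.dumps(manifest, indent=2))
    return path


def resolve_class(dotted: str):
    """Import 'package.module.ClassName' and return the named attribute."""
    module_name, _, class_name = dotted.rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)


def hydrate_from_manifest(snapshot_dir: Path):
    """Rebuild an object from the manifest alone (weights are out of scope here)."""
    manifest = json.loads((snapshot_dir / "manifest.json").read_text())
    cls = resolve_class(manifest["model_class"])
    return cls(**manifest["config"])
```

With a real model, the import inside resolve_class is where the cost of importing the model module would be paid, which is consistent with "import model" dominating the cold breakdown below.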
Benchmark results (Qwen2.5-7B, 15.2GB, RTX 4090):
  Warm hydrate: 0.32s (was 0.42s with cloudpickle)
  Cold hydrate: 1.40s (excluding torch import + inference)
  Cold breakdown:
    manifest:     0.02s
    Rust tok:     0.21s
    mmap tensors: 0.07s
    import model: 1.08s (transformers.models.qwen2)
    meta+load_sd: 0.01s
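A quick arithmetic check of the cold breakdown, using only the figures reported above. The phases sum to 1.39s, which matches the 1.40s cold total to within per-phase rounding (the rounding explanation is an assumption), and the model import accounts for roughly 78% of the cold path.

```python
# Per-phase cold timings in seconds, copied from the breakdown above.
cold_breakdown = {
    "manifest": 0.02,
    "Rust tok": 0.21,
    "mmap tensors": 0.07,
    "import model": 1.08,
    "meta+load_sd": 0.01,
}

total = sum(cold_breakdown.values())
import_share = cold_breakdown["import model"] / total

print(f"sum of phases: {total:.2f}s")            # sum of phases: 1.39s
print(f"import model share: {import_share:.0%}")  # import model share: 78%
```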
V1 cloudpickle snapshots still loadable via _hydrate_v1 fallback.
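One way such a fallback could be dispatched, as a sketch only. The commit message names _hydrate_v1 but does not say how a snapshot's version is detected; checking for a manifest.json file with a version field, and the names hydrate_any and _hydrate_v2, are assumptions made for illustration. Both hydrate bodies are placeholders.

```python
import json
from pathlib import Path


def _hydrate_v1(snapshot_dir: Path) -> str:
    # Placeholder for the legacy cloudpickle path (real body not shown here).
    return "v1"


def _hydrate_v2(snapshot_dir: Path, manifest: dict) -> str:
    # Placeholder for the JSON-manifest path (hypothetical name).
    return "v2"


def hydrate_any(snapshot_dir: Path) -> str:
    """Pick the loader from what is on disk (assumed detection rule)."""
    manifest_path = snapshot_dir / "manifest.json"
    if not manifest_path.exists():
        # No manifest at all: treat the snapshot as a V1 cloudpickle snapshot.
        return _hydrate_v1(snapshot_dir)
    manifest = json.loads(manifest_path.read_text())
    if manifest.get("version", 1) < 2:
        return _hydrate_v1(snapshot_dir)
    return _hydrate_v2(snapshot_dir, manifest)
```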
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent 5a56bf2 commit b8fa30f
3 files changed
Lines changed: 1026 additions & 412 deletions