feat: add Apple Silicon (MPS) support for macOS ARM64
Introduce a device abstraction layer (cosyvoice/utils/device.py) that
unifies CUDA, MPS, and CPU device management. Replace all hardcoded
CUDA-specific code paths in the inference pipeline with device-agnostic
alternatives, enabling CosyVoice to run natively on Apple Silicon Macs.
Key changes:
- Device abstraction: get_device(), get_stream_context(),
get_autocast_context(), empty_cache()
- model.py: Replace CUDA device init, streams, AMP, and cache clearing
across CosyVoiceModel, CosyVoice2Model, CosyVoice3Model
- cosyvoice.py: MPS-aware feature gates (TRT/vLLM require CUDA,
JIT/fp16 require any GPU)
- frontend.py: CoreMLExecutionProvider support for ONNX Runtime
- common.py: Guard torch.cuda.manual_seed_all for non-CUDA environments
- requirements.txt: Remove CUDA-only index URLs, loosen PyTorch version
- setup_macos.sh: One-command setup script for Apple Silicon
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
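The device abstraction described in the list above could be sketched roughly as follows. The four function names come from the commit message; the bodies, and the `get_onnx_providers` helper illustrating the CoreML fallback, are assumptions for illustration, not the actual `cosyvoice/utils/device.py` implementation.

```python
import contextlib

try:
    import torch
except ImportError:  # keep the sketch importable without PyTorch
    torch = None


def get_device() -> str:
    """Pick the best available backend: CUDA, then MPS, then CPU."""
    if torch is not None:
        if torch.cuda.is_available():
            return "cuda"
        mps = getattr(torch.backends, "mps", None)
        if mps is not None and mps.is_available():
            return "mps"
    return "cpu"


def get_stream_context(device: str):
    """CUDA streams have no MPS/CPU equivalent, so fall back to a no-op."""
    if device == "cuda":
        return torch.cuda.stream(torch.cuda.Stream())
    return contextlib.nullcontext()


def get_autocast_context(device: str, fp16: bool):
    """Mixed precision on GPU backends; a no-op context elsewhere."""
    if torch is not None and fp16 and device in ("cuda", "mps"):
        return torch.autocast(device_type=device, dtype=torch.float16)
    return contextlib.nullcontext()


def empty_cache(device: str) -> None:
    """Release cached allocator memory on the active accelerator."""
    if torch is None:
        return
    if device == "cuda":
        torch.cuda.empty_cache()
    elif device == "mps":
        mps = getattr(torch, "mps", None)
        if mps is not None and hasattr(mps, "empty_cache"):
            mps.empty_cache()


def get_onnx_providers(device: str) -> list:
    """Hypothetical helper: ONNX Runtime provider priority per backend."""
    if device == "cuda":
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    if device == "mps":
        return ["CoreMLExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]
```

Call sites that previously used `torch.cuda.stream(...)` or `torch.cuda.empty_cache()` directly can then go through these helpers unchanged on all three backends.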
- Inference runs on MPS (Metal Performance Shaders), which is faster than CPU
- TensorRT and vLLM are not available (CUDA-only)
- Training with DeepSpeed/DDP is not supported
- For CUDA environments (Linux), use `pip install -r requirements-cuda.txt` instead
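The feature gates mentioned for `cosyvoice.py` might look like the following. `apply_feature_gates` and its parameter names are hypothetical, but the rules follow the notes above: TensorRT and vLLM are CUDA-only, while JIT and fp16 work on any GPU backend.

```python
def apply_feature_gates(device, load_trt, load_vllm, load_jit, fp16):
    """Disable features the current backend cannot support (sketch).

    TensorRT and vLLM require CUDA; JIT compilation and fp16 inference
    require a GPU backend, either CUDA or MPS.
    """
    if device != "cuda":
        load_trt, load_vllm = False, False  # CUDA-only features
    if device not in ("cuda", "mps"):
        load_jit, fp16 = False, False       # need some GPU backend
    return load_trt, load_vllm, load_jit, fp16
```

On an Apple Silicon machine (`device == "mps"`) this keeps JIT and fp16 enabled while silently dropping the TensorRT and vLLM requests.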
### Model download
We strongly recommend that you download our pretrained `Fun-CosyVoice3-0.5B`, `CosyVoice2-0.5B`, `CosyVoice-300M`, `CosyVoice-300M-SFT`, or `CosyVoice-300M-Instruct` model and the `CosyVoice-ttsfrd` resource.
assert (self.__class__.__name__ != 'CosyVoiceModel') and not hasattr(self.llm, 'vllm'), 'streaming input text is only implemented for CosyVoice2/3 and does not support vllm!'