update inspiremusic

iris2c · iris2c · commit bd456c297708 · 2025-02-19T09:26:51.000+08:00
diff --git a/inspiremusic/index.html b/inspiremusic/index.html
@@ -44,8 +44,9 @@ <h2>InspireMusic: A Unified Framework for Controlled High-Fidelity Long-Form Mus
 	</div>
 	<p><b>Abstract</b>
 		We introduce <b>InspireMusic</b>, a unified framework designed to generate high-fidelity music, songs, and audio, which integrates an autoregressive transformer with a super-resolution flow-matching model.
-		This framework enables the direct generation of high-fidelity long-form audio at 48kHz from both text and audio modalities. Our model differs from previous approaches, we utilize dual audio tokenizers: a high-bitrate compression audio tokenizer contains richer semantic information,
+		This framework enables the generation of high-fidelity long-form audio at 48kHz from both text and audio modalities. Our model differs from previous approaches in that we utilize dual audio tokenizers: a high-bitrate compression audio tokenizer that contains richer semantic information,
 		thereby reducing training costs and enhancing efficiency, and an acoustic codec that preserves fine-grained acoustic details during flow-matching model training. This combination enables us to achieve high-quality audio generation with long-form coherence.
+		An autoregressive transformer model based on Qwen2.5 is then used to predict 75Hz audio tokens. Next, we employ a super-resolution flow-matching model to learn the latent features of the audio from the 150Hz music tokenizer, and finally, we output high-quality audio waveforms through a vocoder. This framework represents a significant advancement in music generation by directly modeling raw audio, ensuring both diversity and high-fidelity output.
 	</p>
 	</p>
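
As a rough sanity check on the rates quoted in the abstract (75Hz audio tokens, a 150Hz music tokenizer, 48kHz output), the sketch below computes how long each stage's sequence would be for a clip of a given duration. It assumes every stage's length scales linearly with duration; the function and constant names are hypothetical and are not part of the InspireMusic code.

```python
# Rates taken from the abstract; all names here are illustrative only.
AR_TOKEN_RATE_HZ = 75      # tokens predicted by the autoregressive transformer
SR_TOKEN_RATE_HZ = 150     # frame rate of the music tokenizer used by the flow-matching stage
SAMPLE_RATE_HZ = 48_000    # output waveform sample rate


def stage_lengths(duration_s: float) -> dict:
    """Return the per-stage sequence length for a clip of duration_s seconds.

    Assumption: each stage's length is simply rate * duration.
    """
    return {
        "ar_tokens": round(duration_s * AR_TOKEN_RATE_HZ),
        "sr_frames": round(duration_s * SR_TOKEN_RATE_HZ),
        "samples": round(duration_s * SAMPLE_RATE_HZ),
    }


# Ratios implied by the quoted rates.
UPSAMPLE_FACTOR = SR_TOKEN_RATE_HZ // AR_TOKEN_RATE_HZ   # 150Hz frames per 75Hz token
SAMPLES_PER_FRAME = SAMPLE_RATE_HZ // SR_TOKEN_RATE_HZ   # waveform samples per 150Hz frame

if __name__ == "__main__":
    print(stage_lengths(30.0))
    print(UPSAMPLE_FACTOR, SAMPLES_PER_FRAME)
```

Under these assumptions, the super-resolution stage doubles the temporal resolution of the autoregressive output, and the vocoder expands each 150Hz frame into 320 waveform samples.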