FunAudioLLM/InspireMusic-Base-24kHz

0.5B
BF16
131072
Overview

InspireMusic-Base-24kHz: Music Generation Model

InspireMusic-Base-24kHz is a 0.5-billion-parameter music generation model from the FunAudioLLM project. It is built on a unified framework with three components: an audio tokenizer that converts waveforms into discrete tokens, an autoregressive transformer (using Qwen2.5 as its backbone) that predicts those tokens, and a super-resolution flow-matching model that reconstructs higher-fidelity audio from the generated tokens. Together, these components produce audio that stays musically coherent and follows the text or audio prompt.
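As a rough sizing aid, the parameter count above can be turned into a weight-memory estimate. The sketch below assumes all 0.5 billion parameters are stored in BF16 (2 bytes each) and counts weights only; activations, caches, and any components not included in the 0.5B figure would add to this. The helper name `weight_memory_gib` is hypothetical and not part of InspireMusic.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed to hold the weights alone, in GiB.

    bytes_per_param defaults to 2, the size of one BF16 value.
    """
    return num_params * bytes_per_param / 2**30


# Assumption: the "0.5B" figure is taken as exactly 0.5e9 parameters.
NUM_PARAMS = 0.5e9

if __name__ == "__main__":
    print(f"BF16 weights: {weight_memory_gib(NUM_PARAMS):.2f} GiB")     # 0.93 GiB
    print(f"FP32 weights: {weight_memory_gib(NUM_PARAMS, 4):.2f} GiB")  # 1.86 GiB
```

Under these assumptions the weights alone occupy a little under 1 GiB in BF16, so actual inference memory will be higher.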

Key Capabilities

  • High-Quality Music Generation: Produces music with high audio fidelity.
  • Long-Form Audio Generation: Capable of generating extended musical pieces.
  • Text-to-Music: Generates music from descriptive text prompts.
  • Music Continuation: Extends existing audio prompts with new musical content.
  • Unified Framework: Provides both training and inference code for AI-based generative models.
  • 24kHz Mono Audio Support: Trained to output 24kHz mono audio.
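Because the model targets 24kHz mono audio, it can help to check that an audio prompt for music continuation is in that format before using it. The following is a minimal sketch using only Python's standard `wave` module; it does not call InspireMusic itself, and the helper names (`num_samples`, `write_test_tone`, `is_24khz_mono`) are hypothetical. It assumes uncompressed 16-bit PCM WAV files.

```python
import math
import os
import struct
import tempfile
import wave

SAMPLE_RATE = 24_000  # Hz, taken from the model name
CHANNELS = 1          # mono


def num_samples(seconds: float, sample_rate: int = SAMPLE_RATE) -> int:
    """Number of samples per channel for a clip of the given duration."""
    return int(round(seconds * sample_rate))


def write_test_tone(path: str, seconds: float = 1.0, freq: float = 440.0) -> None:
    """Write a 16-bit PCM sine tone at 24 kHz mono (stand-in for a real prompt)."""
    frames = b"".join(
        struct.pack(
            "<h",
            int(0.3 * 32767 * math.sin(2 * math.pi * freq * i / SAMPLE_RATE)),
        )
        for i in range(num_samples(seconds))
    )
    with wave.open(path, "wb") as wf:
        wf.setnchannels(CHANNELS)
        wf.setsampwidth(2)  # 2 bytes = 16-bit samples
        wf.setframerate(SAMPLE_RATE)
        wf.writeframes(frames)


def is_24khz_mono(path: str) -> bool:
    """Return True if the WAV file is 24 kHz and single-channel."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate() == SAMPLE_RATE and wf.getnchannels() == CHANNELS


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "prompt.wav")
    write_test_tone(path, seconds=2.0)
    print(is_24khz_mono(path))  # True
    print(num_samples(30))      # 720000 samples in a 30-second clip
```

A file that fails this check would need resampling or downmixing with an audio library before being used as a prompt.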

Good for

  • Developers and researchers focused on music synthesis and audio AI.
  • Applications requiring text-to-music conversion.
  • Use cases involving the continuation or extension of musical segments.
  • Projects needing a robust framework for training and deploying music generation models.