Overview
InspireMusic-1.5B: Unified Music, Song, and Audio Generation
InspireMusic-1.5B is a 1.5-billion-parameter model from FunAudioLLM, built on the Qwen2.5 autoregressive transformer architecture. It provides a unified framework for generating high-quality music, songs, and audio, combining audio tokenization with a super-resolution flow-matching model to produce detailed, coherent long-form audio.
Key Capabilities
- High-Quality Audio Generation: Produces music, songs, and audio with an emphasis on acoustic quality.
- Long-Form Music Generation: Supports extended musical pieces up to several minutes long.
- Text-to-Music: Generates music from textual prompts.
- Music Continuation: Extends existing audio prompts with new musical content.
- Unified Framework: Combines audio tokenizers, an autoregressive transformer, and a flow-matching model for comprehensive generative tasks.
- Controllable Generation: Supports generation guided by both text and audio prompts.
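The three stages named in the capabilities above (audio tokens, an autoregressive model, and a decoder that turns tokens back into audio) can be pictured with a toy data-flow sketch. This is not the InspireMusic API: every name here (`GenerationRequest`, `generate_tokens`, `decode_to_waveform`, `VOCAB_SIZE`) is hypothetical, a seeded random generator stands in for the transformer, and a trivial scaling function stands in for the flow-matching stage. It only illustrates how text-to-music and music continuation can share one token-generation loop.

```python
from dataclasses import dataclass
from typing import List, Optional
import random

VOCAB_SIZE = 16  # hypothetical codebook size, chosen only for this toy example


@dataclass
class GenerationRequest:
    """Hypothetical request: a text prompt, audio-prompt tokens, or both."""
    text_prompt: Optional[str] = None
    audio_prompt_tokens: Optional[List[int]] = None
    num_new_tokens: int = 8


def generate_tokens(request: GenerationRequest, seed: int = 0) -> List[int]:
    """Stand-in for the autoregressive stage: append one token at a time."""
    if request.text_prompt is None and request.audio_prompt_tokens is None:
        raise ValueError("provide a text prompt, an audio prompt, or both")
    # Continuation starts from the audio-prompt tokens; text-only starts empty.
    context = list(request.audio_prompt_tokens or [])
    # Toy "conditioning": the text only shifts the random seed.
    text_offset = sum(ord(ch) for ch in (request.text_prompt or ""))
    rng = random.Random(seed + text_offset)
    for _ in range(request.num_new_tokens):
        context.append(rng.randrange(VOCAB_SIZE))
    return context


def decode_to_waveform(tokens: List[int], upsample: int = 2) -> List[float]:
    """Stand-in for the decoder: each token becomes `upsample` samples in [0, 1)."""
    return [tok / VOCAB_SIZE for tok in tokens for _ in range(upsample)]


# Text-to-music: no audio prompt, so every token is newly generated.
text_only = generate_tokens(GenerationRequest(text_prompt="calm piano"))

# Music continuation: the output begins with the prompt tokens.
continued = generate_tokens(
    GenerationRequest(audio_prompt_tokens=[3, 7, 1], num_new_tokens=5)
)
waveform = decode_to_waveform(continued)
```

In this sketch, continuation keeps the prompt tokens at the front of the sequence, so the decoded audio would begin with the original material and then move into the new content.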
Good For
- Developers and researchers focused on generative audio applications.
- Creating custom music tracks from text descriptions.
- Extending or completing existing musical segments.
- Experimenting with AI-generated soundscapes and with making the resulting audio more pleasant to listen to.