FunAudioLLM/InspireMusic-1.5B

1.5B
BF16
131072
Overview

InspireMusic-1.5B: Unified Music, Song, and Audio Generation

InspireMusic-1.5B is a 1.5-billion-parameter model from FunAudioLLM, built on the Qwen2.5 autoregressive transformer architecture. It provides a unified framework for generating high-quality music, songs, and audio, combining audio tokenization with a super-resolution flow-matching model to produce detailed, coherent long-form audio.

Key Capabilities

  • High-Quality Audio Generation: Produces music, songs, and audio with an emphasis on acoustic quality.
  • Long-Form Music Generation: Designed for extended musical pieces, up to several minutes long.
  • Text-to-Music: Generates music from textual prompts.
  • Music Continuation: Extends existing audio prompts with new musical content.
  • Unified Framework: Combines audio tokenizers, an autoregressive transformer, and a flow-matching model for comprehensive generative tasks.
  • Controllable Generation: Supports generation guided by both text and audio prompts.
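
The three-stage flow described above (audio tokenizer, autoregressive transformer, flow-matching decoder) can be pictured with a toy sketch. This is not the InspireMusic API: every function name, the codebook size, and the frame size below are hypothetical stand-ins chosen only to show how data moves between the stages and how text-to-music differs from continuation.

```python
import random
from typing import List, Optional

CODEBOOK_SIZE = 256  # hypothetical codebook size, for illustration only
FRAME = 4            # hypothetical number of samples per token


def tokenize_audio(samples: List[float]) -> List[int]:
    """Stand-in audio tokenizer: one discrete token per frame of samples in [-1, 1]."""
    tokens = []
    for i in range(0, len(samples) - FRAME + 1, FRAME):
        mean = sum(samples[i:i + FRAME]) / FRAME
        index = int((mean + 1.0) / 2.0 * CODEBOOK_SIZE)
        tokens.append(max(0, min(CODEBOOK_SIZE - 1, index)))
    return tokens


def generate_tokens(text: str, prompt_tokens: Optional[List[int]], n_new: int) -> List[int]:
    """Stand-in autoregressive model: each step depends on the text and all earlier tokens."""
    tokens = list(prompt_tokens or [])
    for _ in range(n_new):
        # Toy "conditioning": seed a generator with the full context so far.
        rng = random.Random(f"{text}|{tokens}")
        tokens.append(rng.randrange(CODEBOOK_SIZE))
    return tokens


def decode_tokens(tokens: List[int], upsample: int = 2) -> List[float]:
    """Stand-in for the flow-matching decoder: map tokens to values, then interpolate upward."""
    coarse = [t / (CODEBOOK_SIZE - 1) * 2.0 - 1.0 for t in tokens]
    fine: List[float] = []
    for a, b in zip(coarse, coarse[1:] + coarse[-1:]):
        fine.extend(a + (b - a) * k / upsample for k in range(upsample))
    return fine


# Text-to-music: no audio prompt, generation starts from the text alone.
text_to_music = decode_tokens(generate_tokens("calm piano", None, 8))

# Continuation: the audio prompt is tokenized and used as the prefix to extend.
prompt = tokenize_audio([0.0, 0.1, 0.2, 0.3, 0.2, 0.1, 0.0, -0.1])
continued = generate_tokens("calm piano", prompt, 8)
```

In the continuation case the generated sequence keeps the prompt tokens as its prefix, which is the sense in which the model "extends" existing audio. The real system replaces each stand-in with a learned component.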

Good For

  • Developers and researchers focused on generative audio applications.
  • Creating custom music tracks from text descriptions.
  • Extending or completing existing musical segments.
  • Experimenting with AI-generated soundscapes and ways to make audio more pleasing to the ear.