Name: FunAudioLLM/InspireMusic-1.5B-Long API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: FunAudioLLM

InspireMusic-1.5B-Long Overview

InspireMusic-1.5B-Long is a 1.5-billion-parameter model developed by FunAudioLLM for music, song, and audio generation. It uses a unified framework that combines three components: audio tokenization, an autoregressive transformer built on the Qwen2.5 backbone, and a super-resolution flow-matching model. This architecture supports high-quality, long-form audio generation, which distinguishes it from models focused primarily on text or on shorter audio segments.

Key Capabilities

Long-form Music Generation: Capable of generating coherent music pieces lasting several minutes.
High Audio Quality: Utilizes a super-resolution flow-matching model to enhance acoustic details and fidelity.
Unified Framework: Integrates audio tokenizers, an autoregressive transformer, and flow-matching for comprehensive audio generation.
Text-to-Music: Generates music from English text prompts.
Music Continuation: Extends existing audio prompts to create longer musical sequences.
Flexible Inference: Supports two modes. 'Normal' mode applies flow matching for higher quality and requires 24GB of GPU memory; 'fast' mode skips flow matching for quicker generation and requires 12GB.
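
The memory requirements of the two inference modes can be turned into a simple pre-flight check. The following is a minimal sketch, not part of any InspireMusic or Featherless.ai API: the function name `choose_inference_mode` and the constant names are hypothetical, and only the 24GB and 12GB thresholds come from the description above.

```python
# Memory figures are taken from the model description; the function and
# constant names are hypothetical and not part of any InspireMusic API.
NORMAL_MODE_GB = 24.0  # 'normal' mode: flow matching enabled, higher quality
FAST_MODE_GB = 12.0    # 'fast' mode: flow matching skipped, quicker generation


def choose_inference_mode(free_gpu_memory_gb: float) -> str:
    """Return 'normal' or 'fast' for the given amount of free GPU memory (GB)."""
    if free_gpu_memory_gb >= NORMAL_MODE_GB:
        return "normal"
    if free_gpu_memory_gb >= FAST_MODE_GB:
        return "fast"
    raise ValueError(
        f"{free_gpu_memory_gb} GB is below the {FAST_MODE_GB} GB needed for fast mode"
    )


if __name__ == "__main__":
    # Example: report which mode fits on two differently sized GPUs.
    for gb in (32.0, 16.0):
        print(f"{gb:.0f} GB free -> {choose_inference_mode(gb)} mode")
```

A check like this prefers quality whenever the hardware allows it and falls back to fast mode otherwise; how the chosen mode is then passed to the model depends on the inference tooling in use.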

Good For

Developers and researchers focused on creating extended, high-fidelity musical compositions.
Applications requiring text-to-music synthesis or music continuation.
Experiments with a unified framework for diverse audio generation tasks, including planned future support for song and general audio generation.