Ghanibhuti/Musician-Llama-3.2-1B-Instruct
Ghanibhuti/Musician-Llama-3.2-1B-Instruct is a 1-billion-parameter instruction-tuned model based on Llama 3.2, developed by Ghanibhuti. It is fine-tuned for text-to-MIDI music generation: it converts natural language descriptions into MIDI token sequences. It is designed to understand musical concepts, genres, instruments, and styles in support of creative music composition.
Overview
Musician-Llama-3.2-1B-Instruct, developed by Ghanibhuti, is a 1-billion-parameter model fine-tuned from Llama 3.2-1B-Instruct. Its core function is to transform natural language descriptions of music into MIDI token sequences, acting as a specialized music AI assistant. The model is optimized for understanding a range of musical concepts, genres, instruments, and styles.
Key Capabilities
- Text-to-MIDI Generation: Converts text descriptions into pipe-separated MIDI token sequences.
- Musical Understanding: Comprehends diverse musical elements like genres (Jazz, Electronic, Classical), tempos (40-250 BPM), instruments (Piano, Drums, Synth), and moods (Happy, Sad).
- Efficient Inference: Uses 4-bit NF4 quantization with double quantization to reduce memory use at inference time.
- Custom Fine-tuning: Trained on a custom MIDI-caption paired dataset using Supervised Fine-Tuning (SFT) with a maximum sequence length of 4096 tokens.
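The quantization settings listed above can be sketched as a loading routine. This is a minimal sketch, not the author's published recipe: it assumes the checkpoint loads through the Hugging Face `transformers` library with `bitsandbytes` and `accelerate` installed, which this card does not confirm. The helper names `nf4_settings` and `load_model` are hypothetical.

```python
from typing import Any, Dict

MODEL_ID = "Ghanibhuti/Musician-Llama-3.2-1B-Instruct"


def nf4_settings() -> Dict[str, Any]:
    """Keyword arguments for transformers.BitsAndBytesConfig.

    Mirrors the card's description: 4-bit NF4 with double quantization.
    """
    return {
        "load_in_4bit": True,
        "bnb_4bit_quant_type": "nf4",
        "bnb_4bit_use_double_quant": True,
    }


def load_model(model_id: str = MODEL_ID):
    """Load tokenizer and quantized model (assumption: a standard causal-LM checkpoint)."""
    # Imported lazily so nf4_settings() can be inspected without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(**nf4_settings())
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # requires the accelerate package
    )
    return tokenizer, model
```

Calling `load_model()` downloads the weights and typically needs a CUDA-capable GPU for `bitsandbytes`. Keeping the settings in a plain dictionary makes them easy to inspect or adjust before loading.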
Use Cases
- Creative Music Composition: Generate musical ideas and structures from simple text prompts.
- Music Prototyping: Quickly create MIDI sequences for different styles and instruments.
- Educational Tools: Explore music theory and composition by describing desired musical outcomes.
Limitations
- Output requires post-processing to convert MIDI tokens into playable MIDI files.
- Limited to a maximum sequence length of 4096 tokens.
- Quality of output is highly dependent on the specificity and clarity of the input description.
- May occasionally generate unusual pitch combinations.
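As a first post-processing step, the pipe-separated output can be split into individual tokens before any conversion to a MIDI file. The sketch below covers only that splitting step. The function name `split_midi_tokens` and the sample token names are hypothetical, and mapping tokens to MIDI events depends on the model's token vocabulary, which this card does not document.

```python
from typing import List


def split_midi_tokens(raw_output: str) -> List[str]:
    """Split pipe-separated model output into a list of non-empty token strings."""
    return [tok.strip() for tok in raw_output.split("|") if tok.strip()]


# Hypothetical token names, for illustration only.
sample = "TOKEN_A | TOKEN_B||TOKEN_C |"
print(split_midi_tokens(sample))  # ['TOKEN_A', 'TOKEN_B', 'TOKEN_C']
```

Stripping whitespace and dropping empty entries keeps stray separators in the generated text from producing blank tokens downstream.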