AuriAetherwiing/G4-26B-A4B-Musica-v1
AuriAetherwiing/G4-26B-A4B-Musica-v1 is a 26-billion-parameter language model based on Gemma-4-26B-A4B-it, fine-tuned by AuriAetherwiing for roleplay, story generation, creative writing, and conversational tasks. It is the third model in the Musica series. It offers strong creative prose and scenario generation, though it may be less stable and less consistent at following instructions than larger models. It has a context length of 32,768 tokens and was trained with LoRA on a pretokenized dataset.
Overview
AuriAetherwiing/G4-26B-A4B-Musica-v1 is a 26-billion-parameter model fine-tuned from Gemma-4-26B-A4B-it for creative text generation. As the third iteration in the Musica series, it targets roleplay, story generation, and general conversation. It is noted for creative prose and imaginative scenario generation, but its instruction following can be inconsistent, and it may be less stable or "smart" than larger models. It supports a context length of 32,768 tokens.
Key Capabilities
- Creative Writing: Excels in generating engaging prose for roleplay and story creation.
- Conversational Fluency: Designed for natural and dynamic dialogue.
- Refusal Handling: Does not implement refusals, allowing for broader content generation.
- Swipe Diversity: Produces good variety across regenerated responses ("swipes") for the same prompt.
Training Details
The model was trained with Axolotl for 1 epoch, using a LoRA adapter (r64a64, i.e. rank 64 and alpha 64) and a constant learning rate of 1e-5 with warmup. The training data was the pretokenized dataset allura-forge/musica-sft-v1-gemma4-pretok, and the training was sponsored by ArliAI. Training graphs and statistics are available via the CometML Project.
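To make the hyperparameters above concrete, here is a minimal Python sketch of the two quantities they imply. It reads r64a64 as rank 64 and alpha 64 and assumes the standard LoRA scaling of alpha / rank. The card does not give the warmup length, so `warmup_steps` is a hypothetical placeholder, and linear warmup is likewise an assumption. This is an illustration, not the Axolotl configuration used for the run.

```python
def lora_scale(rank: int, alpha: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA: alpha / rank."""
    return alpha / rank


def constant_with_warmup(step: int, base_lr: float = 1e-5, warmup_steps: int = 100) -> float:
    """Learning rate at a given step: linear warmup from 0, then constant.

    warmup_steps=100 is a hypothetical value; the model card does not state it.
    """
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr


if __name__ == "__main__":
    # With rank == alpha the adapter update is applied at scale 1.0.
    print("LoRA scale:", lora_scale(64, 64))
    for step in (0, 50, 100, 1000):
        print(step, constant_with_warmup(step))
```

With rank equal to alpha, the adapter output is added to the base weights without extra amplification or damping, so the effective update size is governed by the learning rate alone.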
Recommended Usage
For optimal performance, the developers recommend specific sampling parameters:
- Temperature: 1
- Min-P: 0.02
- NSigma: 2
Avoid repetition penalties of any kind; they are reported to degrade output quality for this model.
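For readers unfamiliar with these samplers, the following pure-Python sketch shows what the recommended settings do to a vector of logits. It assumes the common definitions: top-nσ keeps logits within n standard deviations of the maximum, and min-p keeps tokens whose probability is at least `min_p` times that of the most likely token. The function names and the order in which the filters are applied are illustrative assumptions; inference backends differ in both, so check your backend's documentation for its actual parameter names.

```python
import math
import random


def filter_logits(logits, temperature=1.0, min_p=0.02, n_sigma=2.0):
    """Return {token_index: probability} after top-n-sigma and min-p filtering."""
    scaled = [x / temperature for x in logits]
    top = max(scaled)

    # Top-n-sigma: keep logits within n_sigma standard deviations of the maximum.
    mean = sum(scaled) / len(scaled)
    std = math.sqrt(sum((x - mean) ** 2 for x in scaled) / len(scaled))
    kept = [i for i, x in enumerate(scaled) if x >= top - n_sigma * std]

    # Softmax over the survivors (subtracting the max for numerical stability).
    weights = {i: math.exp(scaled[i] - top) for i in kept}
    total = sum(weights.values())
    probs = {i: w / total for i, w in weights.items()}

    # Min-p: drop tokens whose probability is below min_p times the top probability.
    cutoff = min_p * max(probs.values())
    probs = {i: p for i, p in probs.items() if p >= cutoff}

    # Renormalize so the remaining probabilities sum to 1.
    norm = sum(probs.values())
    return {i: p / norm for i, p in probs.items()}


def sample(logits, rng=random, **kwargs):
    """Draw one token index from the filtered distribution."""
    probs = filter_logits(logits, **kwargs)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


if __name__ == "__main__":
    toy_logits = [10.0, 9.0, 0.0, 0.0, 0.0, 0.0]
    print(filter_logits(toy_logits))
    print(sample(toy_logits))
```

On the toy logits, only the two high-scoring tokens survive the filters, which is the intended effect: the low-probability tail is cut while the plausible candidates keep their relative weights at temperature 1.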