Overview
Magnum-v4-9b is a 9 billion parameter language model developed by Anthracite, built upon a ChatML-formatted Gemma 2 9B base model. Its primary objective is to emulate the sophisticated prose quality found in Claude 3 models, such as Sonnet and Opus. The model was fine-tuned using a diverse collection of datasets, including several Claude-instruct and synthetic roleplay datasets, to achieve its specialized writing style.
Key Capabilities
- Claude 3 Prose Replication: Specifically engineered to generate text with a similar quality and nuance to Claude 3 models.
- Extended Context Window: Supports a context length of 16384 tokens, allowing for more coherent and detailed long-form generations.
- ChatML Formatting: Utilizes the ChatML format for prompting, ensuring compatibility with common conversational AI frameworks like SillyTavern.
- Robust Training: Fine-tuned over two epochs on 8 NVIDIA H100 GPUs, leveraging a comprehensive dataset suite focused on instruction following and creative writing.
Good For
- Creative Writing and Roleplay: Excels in scenarios requiring detailed narrative generation and character interaction.
- Sophisticated Text Generation: Ideal for applications where high-quality, nuanced, and human-like prose is critical.
- Instruction Following: Benefits from extensive instruction-tuned datasets, making it proficient in adhering to complex prompts.
Limitations
While designed for high prose quality, the model's current Open LLM Leaderboard evaluation results indicate an average score of 23.56, with specific scores like IFEval (0-Shot) at 35.03 and MATH Lvl 5 (4-Shot) at 11.63. Users should consider these benchmarks for tasks requiring strong logical reasoning or mathematical capabilities.