Overview
Rei-V2-12B: Claude 3-like Prose Quality
Rei-V2-12B is a 12-billion-parameter language model developed by Delta-Vector, built on the Mistral-Nemo-Instruct (ChatML'ified) base. It began as an experiment in gradient clipping, but it performed well enough to earn an official release. The model was fine-tuned on a prototype Magnum V5 datamix with the explicit goal of replicating the sophisticated prose quality of the Claude 3 models, particularly Sonnet and Opus.
Key Characteristics
- Base Model: Fine-tuned from Mistral-Nemo-Instruct.
- Prose Quality: Optimized for generating high-quality, nuanced text, aiming for a style similar to Claude 3 models.
- Context Length: Supports a substantial context window of 32768 tokens.
- Training Innovation: Experimented with gradient clipping (max_grad_norm) during training; a clip value of 0.001 was found to strike the best balance between over- and under-fitting (see the training-step sketch at the end of this overview).
- Prompt Format: Uses the ChatML format for structured conversations (see the example after this list).
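Below is a minimal prompting sketch showing how the ChatML format is typically applied through a Hugging Face tokenizer's chat template. The repository id is assumed from the model name and may differ from the actual listing; the system and user messages are placeholders.

```python
# Minimal ChatML prompting sketch. The repo id below is assumed from the
# model name and may not match the actual Hugging Face listing.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Delta-Vector/Rei-V2-12B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# ChatML wraps each turn in <|im_start|>{role} ... <|im_end|> markers:
# <|im_start|>system
# ...<|im_end|>
# <|im_start|>user
# ...<|im_end|>
# <|im_start|>assistant
messages = [
    {"role": "system", "content": "You are a thoughtful creative-writing assistant."},
    {"role": "user", "content": "Write an opening paragraph about a lighthouse keeper."},
]

# apply_chat_template renders the messages with the ChatML template
# bundled in the tokenizer and appends the assistant generation prompt.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=300, temperature=0.8, do_sample=True)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```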
Ideal Use Cases
- Creative Writing: Generating detailed narratives, stories, and descriptive text.
- Roleplay: Engaging in character-driven interactions and maintaining consistent personas.
- Prose Generation: Applications requiring sophisticated and high-quality textual output.
The model was trained for 2 epochs on 8x NVIDIA H200 GPUs, leveraging the Axolotl framework.
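To illustrate what a max_grad_norm of 0.001 corresponds to, here is a hypothetical PyTorch training-step sketch, not the actual Axolotl internals. The model, data, and optimizer are placeholders; only the clipping call reflects the value reported above.

```python
# Illustrative sketch (not the actual Axolotl training code): what a
# max_grad_norm of 0.001 means in a standard PyTorch training step.
import torch
import torch.nn as nn

model = nn.Linear(128, 128)            # placeholder standing in for the 12B model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
loss_fn = nn.MSELoss()

for _ in range(10):                     # placeholder training loop
    x = torch.randn(4, 128)
    loss = loss_fn(model(x), torch.randn(4, 128))

    optimizer.zero_grad()
    loss.backward()

    # Rescale gradients so their global L2 norm never exceeds 0.001,
    # the aggressive clip value the card identifies as the sweet spot.
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1e-3)

    optimizer.step()
```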