Overview
Rei-12B: Claude 3 Prose Replication Model
Delta-Vector/Rei-12B is a 12 billion parameter language model specifically fine-tuned to emulate the prose quality found in the Claude 3 series, including Sonnet and Opus. Built upon a Mistral-Nemo-Instruct base, this model leverages a prototype Magnum V5 datamix for its training.
Key Capabilities & Features
- Claude 3 Prose Style: Engineered to replicate the nuanced and high-quality text generation characteristic of Claude 3 models.
- Extensive Context Window: Supports a 32768 token context length, allowing for detailed and long-form interactions.
- Optimized for Quality: Fine-tuned using a diverse set of ShareGPT-formatted datasets, including those focused on advanced pre-fills, roleplay, and instruction following, to enhance output quality.
- Prompting Flexibility: Recommends specific system prompts, such as Euryale's, to guide model behavior effectively.
Training Details
The model underwent 2 epochs of fine-tuning on 4x NVIDIA RTX 3090 GPUs, utilizing an Axolotl configuration with rSLORA for efficient training. It incorporates Liger plugins for rope, RMS norm, and layer norm, alongside flash attention for performance optimization.
Good For
- Creative Writing: Generating high-quality, descriptive, and engaging prose.
- Roleplay Scenarios: Maintaining consistent character personas and driving narrative forward.
- Complex Conversational AI: Handling intricate dialogues that require nuanced understanding and response generation.