Delta-Vector/Rei-V3-KTO-12B
Delta-Vector/Rei-V3-KTO-12B is a 12 billion parameter language model fine-tuned on the Rei-V3-12B-Base, designed to replicate the prose quality and coherency of Claude 3 models (Sonnet and Opus). This model utilizes a prototype Magnum V5 datamix and incorporates Reinforcement Learning (RL) for refinement. It is optimized for generating high-quality, coherent text with enhanced intelligence and prose style, making it suitable for creative writing and advanced conversational AI applications.
Loading preview...
Overview
Delta-Vector/Rei-V3-KTO-12B, or Rei-12B, is a 12 billion parameter language model developed by Delta-Vector. It is a refinement of the Rei-V3-12B-Base, specifically fine-tuned to enhance coherency, intelligence, and prose quality, aiming to replicate the sophisticated writing style of Claude 3 models (Opus and Sonnet). This model incorporates Reinforcement Learning (RL) and was trained using a prototype Magnum V5 datamix.
Key Capabilities
- Claude-like Prose: Designed to mimic the high-quality, coherent, and intelligent prose of Claude 3 models.
- Refined Base Model: Builds upon the Rei-V3-12B-Base, addressing 'sharp edges' and improving overall performance.
- ChatML Format: Utilizes the ChatML prompt format for structured conversations, with a recommended detailed system prompt for character-driven interactions.
Training Details
The model was trained for 1 epoch on 8x NVIDIA H100 GPUs. Hyperparameters included a gradient clip of 1e-4, optimized for Mistral-12B based models to prevent reward function flat-lining. The training process leveraged Axolotl.
Good For
- Applications requiring high-quality, nuanced text generation.
- Creative writing and role-playing scenarios where sophisticated prose is essential.
- Conversational AI systems aiming for human-like and coherent responses.