Crimson_Dawn-v0.2 is a 12 billion parameter language model developed by Epiculous, based on Mistral-Nemo-Base-2407, and fine-tuned with RSLoRA. This model features a 32768 token context length and is specifically trained using the ChatML format for conversational and instructional tasks. It incorporates significantly more training data than its predecessor, focusing on improved performance in instruction following.
Crimson_Dawn-v0.2 Overview
Epiculous's Crimson_Dawn-v0.2 is a 12 billion parameter language model built upon the Mistral-Nemo-Base-2407 architecture. This iteration significantly expands on the training methodology of v0.1 by incorporating substantially more data and utilizing RSLoRA for fine-tuning, a key departure from the previous regular LoRA approach. A notable change is its training on the ChatML format, moving away from Mistral Formatting, which dictates its prompting structure.
Key Capabilities & Training
- Base Model: Derived from Mistral-Nemo-Base-2407.
- Training Methodology: Employs RSLoRA for fine-tuning, with a two-phased approach over two epochs each on RP data and then instruct data.
- Prompting Format: Optimized for ChatML, requiring specific
<|im_start|>userand<|im_end|>tags for effective interaction. - Context Length: Supports a substantial context window of 32768 tokens.
Performance Metrics
Evaluations on the Open LLM Leaderboard show an average score of 14.82. Specific metrics include:
- IFEval (0-Shot): 31.03
- BBH (3-Shot): 21.69
- MMLU-PRO (5-shot): 19.10
Use Cases
Crimson_Dawn-v0.2 is well-suited for applications requiring instruction-following and conversational AI, particularly where adherence to the ChatML format is feasible. Its enhanced training data and RSLoRA fine-tuning aim to improve its ability to understand and generate responses in a structured chat environment.