Crimson_Dawn-v0.2 Overview
Epiculous's Crimson_Dawn-v0.2 is a 12-billion-parameter language model built on Mistral-Nemo-Base-2407. This iteration expands on v0.1 by training on substantially more data and by switching from regular LoRA to RSLoRA for fine-tuning. Another notable change is the move from Mistral formatting to the ChatML format, which dictates the model's prompting structure.
Key Capabilities & Training
- Base Model: Derived from Mistral-Nemo-Base-2407.
- Training Methodology: Fine-tuned with RSLoRA in two phases: two epochs on roleplay (RP) data, followed by two epochs on instruct data (a configuration sketch appears after this list).
- Prompting Format: Optimized for ChatML, requiring `<|im_start|>user` and `<|im_end|>` tags for effective interaction (an example prompt appears after this list).
- Context Length: Supports a substantial context window of 32,768 tokens.
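As context for the training methodology above, here is a minimal sketch of an RSLoRA setup using the peft library. The rank, alpha, and target modules are illustrative assumptions; the model card does not publish the exact hyperparameters:

```python
from peft import LoraConfig

# Rank-stabilized LoRA (RSLoRA) changes the adapter scaling from
# alpha / r to alpha / sqrt(r), which stabilizes training at higher ranks.
lora_config = LoraConfig(
    r=64,             # adapter rank -- illustrative, not from the model card
    lora_alpha=32,    # scaling numerator -- illustrative
    use_rslora=True,  # enable rank-stabilized scaling
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
```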
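For illustration, a conversation in the ChatML format described above looks like the following (the message contents are placeholders, not taken from the model card):

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
```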
Performance Metrics
Evaluations on the Open LLM Leaderboard show an average score of 14.82. Specific metrics include:
- IFEval (0-Shot): 31.03
- BBH (3-Shot): 21.69
- MMLU-PRO (5-Shot): 19.10
Use Cases
Crimson_Dawn-v0.2 is well-suited for applications requiring instruction-following and conversational AI, particularly where adherence to the ChatML format is feasible. Its enhanced training data and RSLoRA fine-tuning aim to improve its ability to understand and generate responses in a structured chat environment.
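A minimal inference sketch follows, assuming the Hugging Face repository id `Epiculous/Crimson_Dawn-v0.2` and that the bundled tokenizer ships a ChatML chat template:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Epiculous/Crimson_Dawn-v0.2"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# apply_chat_template renders the messages with the ChatML tags shown above.
messages = [{"role": "user", "content": "Explain RSLoRA in one paragraph."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Strip the prompt tokens and decode only the generated continuation.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```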