Epiculous/Crimson_Dawn-v0.2

Parameters: 12B
Precision: FP8
Context length: 32768 tokens
License: apache-2.0

Crimson_Dawn-v0.2 Overview

Epiculous's Crimson_Dawn-v0.2 is a 12-billion-parameter language model fine-tuned from Mistral-Nemo-Base-2407. This iteration expands on v0.1 by training on substantially more data and by switching from regular LoRA to RSLoRA (rank-stabilized LoRA) for fine-tuning. It is also trained on the ChatML format rather than Mistral formatting, which dictates its prompting structure.
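For context, the sketch below shows what a rank-stabilized LoRA (RSLoRA) setup looks like with the PEFT library; the rank, alpha, and target modules are illustrative assumptions, not the hyperparameters actually used for Crimson_Dawn-v0.2.

```python
# Minimal RSLoRA fine-tuning sketch using PEFT (assumed hyperparameters).
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("mistralai/Mistral-Nemo-Base-2407")

config = LoraConfig(
    r=64,                       # assumed adapter rank
    lora_alpha=32,              # assumed scaling factor
    use_rslora=True,            # rank-stabilized scaling: alpha / sqrt(r)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```

Compared with regular LoRA's alpha / r scaling, RSLoRA's alpha / sqrt(r) keeps adapter updates stable at higher ranks, which is the usual motivation for choosing it on larger fine-tuning runs.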

Key Capabilities & Training

  • Base Model: Derived from Mistral-Nemo-Base-2407.
  • Training Methodology: Fine-tuned with RSLoRA in two phases: two epochs on roleplay (RP) data, followed by two epochs on instruct data.
  • Prompting Format: Optimized for ChatML, requiring <|im_start|> and <|im_end|> tags for effective interaction (see the prompt sketch after this list).
  • Context Length: Supports a substantial context window of 32768 tokens.
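For illustration, a ChatML conversation rendered as a raw prompt looks like this (the system message is an assumed placeholder):

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello, who are you?<|im_end|>
<|im_start|>assistant
```

Generation continues from the trailing <|im_start|>assistant header and stops when the model emits <|im_end|>.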

Performance Metrics

Evaluations on the Open LLM Leaderboard show an average score of 14.82. Specific metrics include:

  • IFEval (0-shot): 31.03
  • BBH (3-shot): 21.69
  • MMLU-PRO (5-shot): 19.10

Use Cases

Crimson_Dawn-v0.2 is well-suited for applications requiring instruction-following and conversational AI, particularly where adherence to the ChatML format is feasible. Its enhanced training data and RSLoRA fine-tuning aim to improve its ability to understand and generate responses in a structured chat environment.
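The minimal inference sketch below uses the Transformers chat-template API; it assumes the repository ships a ChatML chat template, and the sampling settings are illustrative rather than recommended.

```python
# Minimal inference sketch for Crimson_Dawn-v0.2 (assumed sampling settings).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Epiculous/Crimson_Dawn-v0.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "user", "content": "Summarize the plot of Hamlet in two sentences."},
]
# apply_chat_template renders the messages with the model's ChatML
# <|im_start|>/<|im_end|> tags and appends the assistant header.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```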