Epiculous/Azure_Dusk-v0.2
Epiculous/Azure_Dusk-v0.2 is a 12 billion parameter language model based on Mistral-Nemo-Base-2407, developed by Epiculous. This iteration features significantly more training data and utilizes RSLoRA, with training specifically on the ChatML format. It is designed for general language tasks, building upon its predecessor Crimson_Dawn-v0.2.
Azure_Dusk-v0.2 Overview
Azure_Dusk-v0.2 is a 12 billion parameter language model developed by Epiculous and based on Mistral-Nemo-Base-2407. Compared with its predecessor, Crimson_Dawn-v0.2, this version uses a substantially larger training dataset and is fine-tuned with RSLoRA rather than standard LoRA.
Key Enhancements & Training Details
- Base Model: Built on Mistral-Nemo-Base-2407.
- Training Methodology: Uses RSLoRA (rank-stabilized LoRA) instead of the standard LoRA used in previous versions. Training ran in two phases of two epochs each: one on RP (roleplay) data and one on instruct data.
- Prompting Format: Trained exclusively on the ChatML format, so prompts should follow ChatML for best results.
- Hardware: Training was conducted on two NVIDIA A6000 GPUs.
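To illustrate the difference between standard LoRA and RSLoRA mentioned in the list above, here is a minimal sketch of the two scaling factors. Standard LoRA scales the adapter update by alpha / r, while rank-stabilized LoRA scales it by alpha / sqrt(r), so the update does not shrink as quickly at higher ranks. The alpha and rank values below are hypothetical and chosen only for illustration; the hyperparameters actually used to train Azure_Dusk-v0.2 are not stated here.

```python
import math


def lora_scaling(alpha: float, rank: int) -> float:
    """Scaling factor applied to the adapter update in standard LoRA."""
    return alpha / rank


def rslora_scaling(alpha: float, rank: int) -> float:
    """Scaling factor applied to the adapter update in rank-stabilized LoRA."""
    return alpha / math.sqrt(rank)


if __name__ == "__main__":
    alpha = 16.0  # hypothetical value, not taken from the model card
    for rank in (8, 64, 256):  # hypothetical ranks
        print(
            f"rank={rank:>3}  "
            f"LoRA={lora_scaling(alpha, rank):.4f}  "
            f"RSLoRA={rslora_scaling(alpha, rank):.4f}"
        )
```

With a fixed alpha, the standard factor falls off linearly in the rank, whereas the rank-stabilized factor falls off only with its square root.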
Performance Metrics
Evaluations on the Open LLM Leaderboard show an average score of 14.03. Specific scores include:
- IFEval (0-shot): 34.67
- BBH (3-shot): 17.40
- MMLU-PRO (5-shot): 22.60
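The three scores listed above are higher than the reported average of 14.03, which suggests the average also covers benchmarks not listed here. The short check below works through the arithmetic. It assumes the average is an unweighted mean over six benchmarks; that count is an assumption about the leaderboard, not something stated in this text, and the result is approximate because the published figures are rounded.

```python
# Scores listed in the text above.
listed_scores = {"IFEval": 34.67, "BBH": 17.40, "MMLU-PRO": 22.60}
reported_average = 14.03

# ASSUMPTION: the reported average is an unweighted mean over six benchmarks.
assumed_benchmark_count = 6

# Mean of only the three listed scores.
mean_of_listed = sum(listed_scores.values()) / len(listed_scores)

# Combined score the unlisted benchmarks would need under that assumption.
implied_total = reported_average * assumed_benchmark_count
implied_unlisted_sum = implied_total - sum(listed_scores.values())

if __name__ == "__main__":
    print(f"Mean of listed scores: {mean_of_listed:.2f}")
    print(f"Implied sum of unlisted scores: {implied_unlisted_sum:.2f}")
```

Under that assumption, the unlisted benchmarks would contribute only a small combined score, which would account for the gap between the listed results and the overall average.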
Usage and Prompting
Users should adhere strictly to the ChatML prompting structure for best results. Example:
<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
Quantized versions (exl2, gguf) are available for broader deployment.
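The ChatML structure shown in the example can also be assembled programmatically. The sketch below is one minimal way to do so in plain Python. The function name `build_chatml_prompt` and its `add_generation_prompt` parameter are hypothetical helpers introduced for illustration, not part of any library or of the model's own tooling.

```python
from typing import Dict, List

IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(
    messages: List[Dict[str, str]], add_generation_prompt: bool = True
) -> str:
    """Render a list of {"role", "content"} messages as a ChatML string.

    Hypothetical helper: when add_generation_prompt is True, an open
    assistant turn is appended so the model continues from there.
    """
    parts = []
    for message in messages:
        parts.append(f"{IM_START}{message['role']}\n{message['content']}{IM_END}\n")
    if add_generation_prompt:
        parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    prompt = build_chatml_prompt([{"role": "user", "content": "Hi there!"}])
    print(prompt)
```

If the model's tokenizer ships its own chat template, that template would normally be the safer choice. This sketch only shows what the resulting string looks like.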