Epiculous/Azure_Dusk-v0.2

Text Generation | Concurrency Cost: 1 | Model Size: 12B | Quant: FP8 | Ctx Length: 32k | Published: Sep 9, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

Epiculous/Azure_Dusk-v0.2 is a 12 billion parameter language model based on Mistral-Nemo-Base-2407, developed by Epiculous. This iteration features significantly more training data and utilizes RSLoRA, with training specifically on the ChatML format. It is designed for general language tasks, building upon its predecessor Crimson_Dawn-v0.2.

Azure_Dusk-v0.2 Overview

Azure_Dusk-v0.2 is a 12 billion parameter language model developed by Epiculous, built on Mistral-Nemo-Base-2407. It updates its predecessor, Crimson_Dawn-v0.2, with a substantially larger training dataset and with fine-tuning via RSLoRA (rank-stabilized LoRA) in place of standard LoRA.

Key Enhancements & Training Details

  • Base Model: Built on Mistral-Nemo-Base-2407.
  • Training Methodology: Uses RSLoRA (previous versions used standard LoRA) with a two-phase approach: one phase on RP (roleplay) data and one on instruct data, each run for two epochs.
  • Prompting Format: Trained exclusively on the ChatML format, so prompts should follow ChatML for best results.
  • Hardware: Training was conducted on two NVIDIA A6000 GPUs.
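
For context on the RSLoRA item above: rank-stabilized LoRA differs from standard LoRA in how the adapter update is scaled, dividing alpha by the square root of the rank rather than by the rank itself. The sketch below illustrates only that difference. The alpha and rank values are illustrative assumptions, since the hyperparameters used for Azure_Dusk-v0.2 are not stated here.

```python
import math


def lora_scale(alpha: float, rank: int) -> float:
    """Standard LoRA scaling factor: alpha / r."""
    return alpha / rank


def rslora_scale(alpha: float, rank: int) -> float:
    """Rank-stabilized LoRA scaling factor: alpha / sqrt(r)."""
    return alpha / math.sqrt(rank)


# Hypothetical values for illustration only (not the model's actual settings).
ALPHA = 16
for rank in (8, 64, 256):
    print(f"rank={rank:>3}  lora={lora_scale(ALPHA, rank):.4f}  "
          f"rslora={rslora_scale(ALPHA, rank):.4f}")
```

With standard scaling the factor shrinks quickly as the rank grows, while the square-root form shrinks more slowly, which is the motivation usually given for using RSLoRA at higher ranks.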

Performance Metrics

Evaluations on the Open LLM Leaderboard show an average score of 14.03. Specific scores include:

  • IFEval (0-Shot): 34.67
  • BBH (3-Shot): 17.40
  • MMLU-PRO (5-shot): 22.60

Usage and Prompting

Users should adhere strictly to the ChatML prompting structure for best results. Example:

<|im_start|>user
Hi there!<|im_end|>
<|im_start|>assistant
Nice to meet you!<|im_end|>
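
The structure shown above can be produced programmatically. The following is a minimal sketch, not the author's tooling: the helper name `format_chatml` and its `add_generation_prompt` parameter are hypothetical, and it assumes each turn ends with `<|im_end|>` followed by a newline, as in the example.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML prompt string.

    Hypothetical helper for illustration; it mirrors the turn layout
    shown in the example above.
    """
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


history = [
    {"role": "user", "content": "Hi there!"},
    {"role": "assistant", "content": "Nice to meet you!"},
]

# Reproduce the two-turn example without opening a new assistant turn.
prompt = format_chatml(history, add_generation_prompt=False)
print(prompt)
```

When sending a new user message for completion, leaving `add_generation_prompt=True` ends the prompt with an open assistant turn for the model to fill in.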

Quantized versions (exl2, gguf) are available for broader deployment.

Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model cover the following settings: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, min_p.