flammenai/Flammades-Mistral-Nemo-12B

12B parameters · FP8 · 32768-token context · License: apache-2.0

Overview

Flammades-Mistral-Nemo-12B is a 12-billion-parameter language model developed by flammenai. It is a fine-tune of nbeerbower/Mistral-Nemo-Gutenberg-Doppel-12B-v2 and supports a 32768-token context window.
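The model loads with the standard Hugging Face transformers API. The sketch below is illustrative: the dtype and generation settings are assumptions, not values from the model card, and the FP8 serving noted above is a deployment detail not shown here.

```python
# Minimal loading/generation sketch via the standard transformers API.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "flammenai/Flammades-Mistral-Nemo-12B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 keeps the 12B weights at ~24 GB
    device_map="auto",
)

prompt = "Briefly explain what a preference-tuned language model is."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```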

Key Characteristics

  • Base Model: Built on nbeerbower/Mistral-Nemo-Gutenberg-Doppel-12B-v2.
  • Fine-tuning Method: Utilizes the ORPO (Odds Ratio Preference Optimization) technique, trained over 3 epochs using dual RTX 3090 GPUs.
  • Training Data: Fine-tuned on two DPO-format preference datasets, flammenai/Date-DPO-NoAsterisks and jondurbin/truthy-dpo-v0.1, aimed at improving conversational quality and truthfulness (see the sketch after this list).
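For reference, here is a hypothetical sketch of this kind of ORPO run using TRL's ORPOTrainer. The authors' actual training script is not reproduced on this page, so the hyperparameters (beta, learning rate, batch sizes, sequence lengths) are placeholders; only the base model, the dataset, and the 3-epoch count come from the card.

```python
# Hypothetical ORPO fine-tuning sketch with TRL; hyperparameters are placeholders.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_id = "nbeerbower/Mistral-Nemo-Gutenberg-Doppel-12B-v2"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# One of the two preference datasets named on the card; rows carry
# "prompt", "chosen", and "rejected" fields, which is what ORPO consumes.
dataset = load_dataset("jondurbin/truthy-dpo-v0.1", split="train")

config = ORPOConfig(
    output_dir="flammades-orpo",
    beta=0.1,                      # odds-ratio loss weight (placeholder)
    num_train_epochs=3,            # matches the 3 epochs noted above
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    learning_rate=5e-6,
    max_length=2048,
    max_prompt_length=1024,
)

trainer = ORPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,    # `tokenizer=` in older TRL releases
)
trainer.train()
```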

Performance Insights

Evaluations on the Open LLM Leaderboard indicate a balanced performance across various benchmarks. Notable scores include:

  • Avg.: 22.34
  • IFEval (0-shot): 38.42
  • BBH (3-shot): 32.39
  • MMLU-PRO (5-shot): 29.57

These scores suggest the model is suited to tasks requiring general reasoning and instruction following, benefiting from its ORPO-based preference alignment.

Use Cases

This model is well-suited to text generation and conversational AI, and to tasks where alignment with human preferences and truthfulness matter, owing to its ORPO fine-tuning on the preference datasets listed above. A usage sketch for chat follows.
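The sketch below assumes the tokenizer ships a chat template, as Mistral-Nemo derivatives typically do; the example question and generation settings are illustrative.

```python
# Conversational usage sketch, assuming the tokenizer provides a chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "flammenai/Flammades-Mistral-Nemo-12B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user", "content": "What year did Apollo 11 land on the Moon?"},
]
# Render the conversation with the model's chat template and append the
# assistant turn marker before generating.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=64, do_sample=False)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```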