Flammades-Mistral-Nemo-12B Overview
Flammades-Mistral-Nemo-12B is a 12-billion-parameter language model developed by flammenai. It is fine-tuned from the nbeerbower/Mistral-Nemo-Gutenberg-Doppel-12B-v2 base model and supports a 32768-token context window.
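As a quick orientation, here is a minimal loading-and-generation sketch using Hugging Face transformers. The repo id flammenai/Flammades-Mistral-Nemo-12B is the published checkpoint; the dtype and device settings are illustrative assumptions, not requirements.

```python
# Minimal loading sketch with Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "flammenai/Flammades-Mistral-Nemo-12B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to roughly halve memory use (assumption)
    device_map="auto",           # requires the `accelerate` package
)

prompt = "Summarize the themes of Project Gutenberg's most-read novels."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```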
Key Characteristics
- Base Model: Built on nbeerbower/Mistral-Nemo-Gutenberg-Doppel-12B-v2.
- Fine-tuning Method: ORPO (Odds Ratio Preference Optimization), trained for 3 epochs on dual RTX 3090 GPUs (a minimal training sketch follows this list).
- Training Data: Fine-tuned on two DPO-format preference datasets, flammenai/Date-DPO-NoAsterisks and jondurbin/truthy-dpo-v0.1, aimed at improving conversational quality and factual alignment.
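For readers curious how such a run is wired up, below is a minimal ORPO sketch using the TRL library. The base model and dataset names come from the card above; every hyperparameter (beta, learning rate, batch size, sequence length) is an illustrative assumption, not the value flammenai used, and newer TRL releases rename the `tokenizer` argument to `processing_class`.

```python
# ORPO fine-tuning sketch with TRL (pip install trl datasets).
from datasets import concatenate_datasets, load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base = "nbeerbower/Mistral-Nemo-Gutenberg-Doppel-12B-v2"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# ORPO expects preference pairs; assumes both datasets expose these columns.
cols = ["prompt", "chosen", "rejected"]
ds1 = load_dataset("flammenai/Date-DPO-NoAsterisks", split="train").select_columns(cols)
ds2 = load_dataset("jondurbin/truthy-dpo-v0.1", split="train").select_columns(cols)
train = concatenate_datasets([ds1, ds2]).shuffle(seed=42)

args = ORPOConfig(
    output_dir="flammades-orpo",
    num_train_epochs=3,             # matches the 3 epochs reported above
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,  # assumption: compensates for the tiny batch
    learning_rate=5e-6,             # assumption
    beta=0.1,                       # ORPO odds-ratio weight; assumption
    max_length=2048,                # assumption; well under the 32768-token window
)

trainer = ORPOTrainer(model=model, args=args, train_dataset=train, tokenizer=tokenizer)
trainer.train()
```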
Performance Insights
Evaluations on the Open LLM Leaderboard indicate balanced performance across its benchmark suite. Notable scores include:
- Avg.: 22.34
- IFEval (0-shot): 38.42
- BBH (3-shot): 32.39
- MMLU-PRO (5-shot): 29.57
These scores suggest the model is well suited to tasks requiring general reasoning and instruction following, aided by its ORPO-based preference alignment.
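If you want to sanity-check these numbers locally, the sketch below uses the lm-evaluation-harness Python API. The leaderboard_* task names follow the Open LLM Leaderboard v2 convention in recent harness releases (an assumption worth verifying against your installed version), and exact scores will vary with harness version and hardware.

```python
# Reproduction sketch with EleutherAI's lm-evaluation-harness (pip install lm-eval).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=flammenai/Flammades-Mistral-Nemo-12B,dtype=bfloat16",
    tasks=["leaderboard_ifeval", "leaderboard_bbh", "leaderboard_mmlu_pro"],
    batch_size=8,  # assumption; tune to available GPU memory
)

# Print per-task metric dictionaries.
for task, metrics in results["results"].items():
    print(task, metrics)
```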
Use Cases
Thanks to its ORPO fine-tuning on preference data, this model is well suited to robust text generation, conversational AI, and applications where alignment with human preferences and truthfulness are important.
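As a concrete starting point for the conversational use case, here is a chat-style generation sketch via the transformers pipeline. It assumes the checkpoint ships a chat template (Mistral-family models typically do) and a recent transformers release that accepts message lists; the sampling settings are illustrative.

```python
# Chat-style generation sketch; assumes the tokenizer ships a chat template.
import torch
from transformers import pipeline

chat = pipeline(
    "text-generation",
    model="flammenai/Flammades-Mistral-Nemo-12B",
    torch_dtype=torch.bfloat16,  # assumption, as above
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Is the Great Wall of China visible from space?"}
]
out = chat(messages, max_new_tokens=128, do_sample=True, temperature=0.7)

# The pipeline returns the full conversation; the last message is the reply.
print(out[0]["generated_text"][-1]["content"])
```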