flammenai/Flammades-Mistral-Nemo-12B

Text generation | Concurrency cost: 1 | Model size: 12B | Quant: FP8 | Context length: 32k | Published: Oct 5, 2024 | License: apache-2.0 | Architecture: Transformer | Open weights

Flammades-Mistral-Nemo-12B is a 12 billion parameter language model developed by flammenai, built upon the nbeerbower/Mistral-Nemo-Gutenberg-Doppel-12B-v2 base model. It was fine-tuned with ORPO on the preference datasets flammenai/Date-DPO-NoAsterisks and jondurbin/truthy-dpo-v0.1 to improve its conversational quality and truthfulness. With a context length of 32768 tokens, it is suited to general-purpose text generation and understanding, and it benefits from its preference-tuned alignment.


Flammades-Mistral-Nemo-12B Overview

Flammades-Mistral-Nemo-12B is a 12 billion parameter language model developed by flammenai. It is a fine-tuned version of the nbeerbower/Mistral-Nemo-Gutenberg-Doppel-12B-v2 base model, leveraging a substantial 32768-token context window.

Key Characteristics

  • Base Model: Built on nbeerbower/Mistral-Nemo-Gutenberg-Doppel-12B-v2.
  • Fine-tuning Method: Utilizes the ORPO (Odds Ratio Preference Optimization) technique, trained over 3 epochs using dual RTX 3090 GPUs.
  • Training Data: Fine-tuned on specific DPO datasets: flammenai/Date-DPO-NoAsterisks and jondurbin/truthy-dpo-v0.1, aimed at improving conversational quality and factual alignment.
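To make the fine-tuning method concrete: ORPO augments the standard SFT loss on the chosen response with a penalty that pushes the odds of the chosen response above the odds of the rejected one. A minimal numeric sketch, assuming length-normalized response likelihoods as inputs (the function names and the default `lam` value are illustrative, not taken from this model's training config):

```python
import math

def odds(p: float) -> float:
    # Odds of a probability: p / (1 - p).
    return p / (1.0 - p)

def orpo_loss(nll_chosen: float, p_chosen: float, p_rejected: float,
              lam: float = 0.1) -> float:
    """ORPO objective: SFT loss (NLL of the chosen response) plus a
    lambda-weighted odds-ratio term, -log sigmoid(log odds ratio),
    which rewards higher odds for chosen over rejected responses."""
    log_odds_ratio = math.log(odds(p_chosen)) - math.log(odds(p_rejected))
    l_or = -math.log(1.0 / (1.0 + math.exp(-log_odds_ratio)))
    return nll_chosen + lam * l_or
```

Because the odds-ratio term is folded into a single objective, ORPO needs no separate reference model, which is part of why a 12B fine-tune fits on two RTX 3090s.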

Performance Insights

Evaluations on the Open LLM Leaderboard indicate a balanced performance across various benchmarks. Notable scores include:

  • Avg.: 22.34
  • IFEval (0-Shot): 38.42
  • BBH (3-Shot): 32.39
  • MMLU-PRO (5-shot): 29.57

These metrics suggest its suitability for tasks requiring general reasoning and instruction following, benefiting from its DPO alignment.

Use Cases

Thanks to its ORPO fine-tuning on preference datasets, this model is well-suited for robust text generation, conversational AI, and tasks where alignment with human preferences and truthfulness are important.
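A minimal inference sketch with the Hugging Face `transformers` library. The repo id comes from this card, but the `[INST] ... [/INST]` prompt format below is an assumption based on Mistral-family conventions; in practice, prefer the chat template shipped with the model's tokenizer:

```python
def build_prompt(user_message: str) -> str:
    # Assumed Mistral-style instruct format; verify against the
    # tokenizer's chat template before relying on it.
    return f"[INST] {user_message} [/INST]"

if __name__ == "__main__":
    # Requires `pip install transformers torch` and enough VRAM
    # (or multi-GPU sharding) for a 12B model.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = "flammenai/Flammades-Mistral-Nemo-12B"
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

    inputs = tokenizer(build_prompt("Summarize ORPO in one sentence."),
                       return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```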

Popular Sampler Settings

The three parameter combinations most used by Featherless users for this model cover the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
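These settings reshape the next-token distribution before sampling. A pure-Python sketch, assuming raw logits as input, of how temperature, top_k, top_p, and min_p interact (the three penalty parameters, which rescale logits of tokens already generated, are omitted for brevity; real backends run the same steps on tensors):

```python
import math

def sample_filter(logits, temperature=1.0, top_k=0, top_p=1.0, min_p=0.0):
    """Apply common sampler settings to raw logits and return the
    renormalized next-token probability distribution."""
    # Temperature: scale logits before softmax (lower => sharper).
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # top_k: keep only the k most likely tokens (0 disables).
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(order if top_k <= 0 else order[:top_k])

    # top_p (nucleus): keep the smallest prefix whose mass reaches top_p.
    if top_p < 1.0:
        cum, nucleus = 0.0, set()
        for i in order:
            nucleus.add(i)
            cum += probs[i]
            if cum >= top_p:
                break
        keep &= nucleus

    # min_p: drop tokens below min_p times the top token's probability.
    if min_p > 0.0:
        cutoff = min_p * probs[order[0]]
        keep &= {i for i in range(len(probs)) if probs[i] >= cutoff}

    # Renormalize over the surviving tokens.
    mass = sum(probs[i] for i in keep)
    return [probs[i] / mass if i in keep else 0.0 for i in range(len(probs))]
```

Note the ordering: temperature is applied first, then the truncation filters; applying them in a different order gives different distributions.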