nbeerbower/mistral-nemo-bophades-12B

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:12BQuant:FP8Ctx Length:32kPublished:Aug 13, 2024License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

The nbeerbower/mistral-nemo-bophades-12B is a 12 billion parameter Mistral-Nemo-Instruct-2407 model fine-tuned on truthy-dpo-v0.1 and orca_math_dpo datasets. This model is optimized for improved truthfulness and mathematical reasoning, achieving an average score of 24.72 on the Open LLM Leaderboard. It is suitable for applications requiring enhanced factual accuracy and problem-solving capabilities.

Loading preview...

Model Overview

nbeerbower/mistral-nemo-bophades-12B is a fine-tuned variant of the Mistral-Nemo-Instruct-2407 base model. It has been specifically trained using DPO (Direct Preference Optimization) on two distinct datasets: jondurbin/truthy-dpo-v0.1 and kyujinpy/orca_math_dpo. This training approach aims to enhance the model's performance in areas related to factual accuracy and mathematical reasoning.

Key Capabilities & Performance

The model was fine-tuned for one epoch on an A100 GPU. Its performance has been evaluated on the Open LLM Leaderboard, where it achieved an average score of 24.72. Notable individual metric scores include:

  • IFEval (0-Shot): 67.94
  • BBH (3-Shot): 29.54
  • MATH Lvl 5 (4-Shot): 6.27
  • MMLU-PRO (5-shot): 27.79

Good For

  • Applications requiring improved truthfulness and reduced hallucination.
  • Tasks involving mathematical problem-solving and reasoning.
  • Use cases where a fine-tuned Mistral-Nemo variant with DPO benefits is desired.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p