eren23/OGNO-7b-dpo-truthful

Task: Text Generation
Concurrency Cost: 1
Model Size: 7B
Quant: FP8
Ctx Length: 8k
Published: Feb 16, 2024
License: apache-2.0
Architecture: Transformer
Open Weights

eren23/OGNO-7b-dpo-truthful is a 7 billion parameter language model, DPO fine-tuned from paulml/OGNO-7B, which is a Mistral 7B variant. The model is tuned for truthfulness and scores 76.61% on TruthfulQA (0-shot). Its average score across the six Open LLM Leaderboard benchmarks is 76.14, which makes it a candidate for applications that depend on factual accuracy and general reasoning.


Model Overview

eren23/OGNO-7b-dpo-truthful is an experimental 7 billion parameter language model, fine-tuned using Direct Preference Optimization (DPO) on the jondurbin/truthy-dpo-v0.1 dataset. It is based on paulml/OGNO-7B, which itself is a variant of the Mistral 7B architecture.
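
For readers unfamiliar with DPO, the following is a minimal sketch of the standard per-example DPO objective, written in plain Python. It is not the author's training code: the function names are hypothetical, and the value beta=0.1 is only an illustrative default, since the card does not state the hyperparameters that were used. Each argument is the summed log-probability of a response under either the model being trained (the policy) or the frozen reference model, here assumed to be paulml/OGNO-7B.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # illustrative value; not taken from the model card
) -> float:
    """Per-example DPO loss for one (chosen, rejected) response pair."""
    # Implicit rewards: how much more likely each response is under the
    # policy than under the frozen reference model (in log space).
    chosen_reward = policy_chosen_logp - ref_chosen_logp
    rejected_reward = policy_rejected_logp - ref_rejected_logp
    # The loss shrinks as the chosen response gains margin over the rejected one.
    return -log_sigmoid(beta * (chosen_reward - rejected_reward))
```

When the policy and the reference assign identical log-probabilities, the margin is zero and the loss equals log 2. Training on a preference dataset such as jondurbin/truthy-dpo-v0.1 pushes the loss below that value by raising the relative likelihood of the preferred answers.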

Key Capabilities & Performance

This model is notably optimized for generating truthful responses, as evidenced by its performance on the TruthfulQA benchmark. Its evaluation on the Open LLM Leaderboard highlights a strong overall performance:

  • Avg. Score: 76.14
  • TruthfulQA (0-shot): 76.61%
  • AI2 Reasoning Challenge (25-shot): 72.95%
  • HellaSwag (10-shot): 89.02%
  • MMLU (5-shot): 64.61%
  • Winogrande (5-shot): 84.69%
  • GSM8k (5-shot): 68.99%
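
As a quick sanity check, the average can be recomputed from the six task scores listed above. The sketch below assumes the average is the unweighted mean of those six numbers; the dictionary name is hypothetical.

```python
# Benchmark scores copied from the list above (percentages).
scores = {
    "TruthfulQA (0-shot)": 76.61,
    "AI2 Reasoning Challenge (25-shot)": 72.95,
    "HellaSwag (10-shot)": 89.02,
    "MMLU (5-shot)": 64.61,
    "Winogrande (5-shot)": 84.69,
    "GSM8k (5-shot)": 68.99,
}

# Unweighted mean across the six tasks.
average = sum(scores.values()) / len(scores)
print(f"Average: {average:.3f}")
```

The mean comes to roughly 76.145, which agrees with the reported 76.14 to within rounding.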

Use Cases

Given its DPO fine-tuning for truthfulness and solid performance across reasoning and common sense benchmarks, this model is particularly well-suited for:

  • Applications where factual accuracy is critical.
  • Tasks requiring robust reasoning and understanding.
  • Experimental deployments for evaluating DPO-tuned models in truth-oriented scenarios.

Although this is currently an experimental release, its focus on truthfulness makes it a useful candidate for research and development work that demands reliable information generation.
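
For experimental use, one possible way to run the model locally is through the Hugging Face transformers library. The sketch below assumes that the weights are available on the Hugging Face Hub under the same identifier, that torch, transformers, and accelerate are installed, and that a plain-text prompt is acceptable; the card does not document a prompt or chat template. The helper name generate and its defaults are hypothetical.

```python
MODEL_ID = "eren23/OGNO-7b-dpo-truthful"


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model and return a greedy completion for `prompt`.

    Assumes the weights can be downloaded from the Hugging Face Hub.
    """
    # Imported lazily so this module can be imported without the heavy
    # dependencies being loaded up front.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # half precision to reduce memory use
        device_map="auto",          # requires the accelerate package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Greedy decoding keeps the output deterministic for evaluation runs.
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=False
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling generate("Is the Great Wall of China visible from space?") would download the weights on first use and return the prompt followed by the completion. The model is reloaded on every call, which is acceptable for a one-off test but should be restructured for repeated queries.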
