eren23/OGNO-7b-dpo-truthful
eren23/OGNO-7b-dpo-truthful is a 7 billion parameter language model, DPO fine-tuned from paulml/OGNO-7B, a Mistral 7B variant. The model is specifically optimized for truthfulness, scoring 76.61% on TruthfulQA (0-shot), and shows strong general reasoning with an average Open LLM Leaderboard score of 76.14, making it suitable for applications that require factual accuracy and robust reasoning.
Model Overview
eren23/OGNO-7b-dpo-truthful is an experimental 7 billion parameter language model, fine-tuned using Direct Preference Optimization (DPO) on the jondurbin/truthy-dpo-v0.1 dataset. It is based on paulml/OGNO-7B, which itself is a variant of the Mistral 7B architecture.
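For quick experimentation, the model can be loaded with the Hugging Face transformers library. The sketch below is a minimal example, assuming the standard AutoModelForCausalLM API; half precision is used so the 7B weights fit on a single GPU.

```python
# Minimal loading sketch, assuming the standard transformers
# text-generation API (model ID taken from the card above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "eren23/OGNO-7b-dpo-truthful"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision: roughly 14 GB for a 7B model
    device_map="auto",          # place weights on the available GPU(s)
)
```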
Key Capabilities & Performance
This model is notably optimized for generating truthful responses, as evidenced by its performance on the TruthfulQA benchmark. Its evaluation on the Open LLM Leaderboard highlights a strong overall performance:
- Avg. score: 76.14
- TruthfulQA (0-shot): 76.61%
- AI2 Reasoning Challenge (25-shot): 72.95%
- HellaSwag (10-shot): 89.02%
- MMLU (5-shot): 64.61%
- Winogrande (5-shot): 84.69%
- GSM8K (5-shot): 68.99%
Use Cases
Given its DPO fine-tuning for truthfulness and its solid performance across reasoning and commonsense benchmarks, this model is particularly well-suited for:
- Applications where factual accuracy is critical.
- Tasks requiring robust reasoning and understanding.
- Experimental deployments for evaluating DPO-tuned models in truth-oriented scenarios.
While this is currently an experimental release, the model's focus on truthfulness makes it a valuable candidate for research and development in areas that demand reliable information generation.
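As a usage illustration, the hedged sketch below queries the model with a TruthfulQA-style factual question through the transformers pipeline API. The plain question/answer prompt format here is an assumption for illustration only; check the model card for the exact prompt or chat template the fine-tune expects.

```python
# Hedged usage sketch: a TruthfulQA-style factual query.
# NOTE: the plain "Question:/Answer:" prompt format is an assumption,
# not a documented template for this fine-tune.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="eren23/OGNO-7b-dpo-truthful",
    device_map="auto",
)

prompt = "Question: What happens if you crack your knuckles a lot?\nAnswer:"
result = generator(prompt, max_new_tokens=128, do_sample=False)
print(result[0]["generated_text"])
```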