kyujinpy/Sakura-SOLRCA-Instruct-DPO
Sakura-SOLRCA-Instruct-DPO is a 10.7 billion parameter instruction-tuned causal language model developed by Kyujin Han and the LLM research consortium of Media Group Saramgwasup and Marker. The model was fine-tuned with the DPO method on the Intel/orca_dpo_pairs dataset and achieves an average score of 74.05 on the Open LLM Leaderboard. It is designed for general-purpose instruction following and reasoning tasks.
Model Overview
Sakura-SOLRCA-Instruct-DPO is a 10.7 billion parameter instruction-tuned language model developed by Kyujin Han (kyujinpy) in collaboration with the LLM research consortium of Media Group Saramgwasup and Marker. It was fine-tuned with Direct Preference Optimization (DPO) on the Intel/orca_dpo_pairs dataset to improve its instruction-following capabilities.
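For readers unfamiliar with DPO, the sketch below computes the standard DPO loss for a single preference pair (a chosen and a rejected response to the same prompt). It is an illustration of the general technique, not the training code used for this model: the function name `dpo_loss`, the value `beta=0.1`, and the log-probabilities in the usage lines are assumptions made for the example.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    beta=0.1 is an illustrative value, not one taken from the model card.
    """
    # Implicit rewards: how far the policy has moved from the reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)) written as log(1 + exp(-margin)).
    return math.log1p(math.exp(-margin))


# Hypothetical log-probabilities, for illustration only.
loss_at_start = dpo_loss(-10.0, -12.0, -10.0, -12.0)  # policy == reference
loss_improved = dpo_loss(-9.0, -13.0, -10.0, -12.0)   # policy favors chosen more
print(loss_at_start, loss_improved)
```

When the policy matches the reference the margin is zero and the loss equals log 2; the loss falls as the policy raises the chosen response's likelihood relative to the rejected one.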
Key Capabilities & Performance
The model was evaluated on the Hugging Face Open LLM Leaderboard, where it achieved an average score of 74.05 across the following six benchmarks:
- AI2 Reasoning Challenge (ARC): 71.16
- HellaSwag: 88.49
- MMLU: 66.17
- TruthfulQA: 72.10
- Winogrande: 82.95
- GSM8K: 63.46
These scores indicate strong general reasoning, commonsense, and knowledge abilities for a model of this size, with grade-school math (GSM8K) the weakest of the six results. Training details and code are openly shared in the Sakura-SOLAR GitHub repository.
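The reported average can be checked directly from the six scores listed above. The snippet below is a small sanity check; the dictionary name `scores` is chosen for the example. The unrounded mean is about 74.055, which agrees with the reported 74.05 up to rounding in the last digit.

```python
# Open LLM Leaderboard scores as listed in this model card.
scores = {
    "ARC": 71.16,
    "HellaSwag": 88.49,
    "MMLU": 66.17,
    "TruthfulQA": 72.10,
    "Winogrande": 82.95,
    "GSM8K": 63.46,
}

# Unweighted mean over the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"Average over {len(scores)} benchmarks: {average:.3f}")
```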
When to Use This Model
Sakura-SOLRCA-Instruct-DPO suits applications that need a capable instruction-following model in the 10B-parameter range. Its balanced benchmark results make it a strong candidate for:
- General-purpose conversational AI.
- Reasoning and question-answering tasks.
- Applications where a 10.7B parameter model offers a good balance between performance and computational efficiency.
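To make the computational side of that trade-off concrete, the sketch below estimates how much memory the weights alone would occupy at a few common precisions. It is a back-of-the-envelope calculation, assuming exactly 10.7 billion parameters; the helper `weight_memory_gb` and the precision table are introduced for this example, and the figures exclude activations, the KV cache, and framework overhead.

```python
PARAMS = 10.7e9  # parameter count stated in this model card

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {
    "float32": 4.0,
    "float16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(num_params, bytes_per_param):
    """Return the weight storage size in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


for precision, nbytes in BYTES_PER_PARAM.items():
    print(f"{precision:>8}: {weight_memory_gb(PARAMS, nbytes):.2f} GB")
```

At 16-bit precision the weights come to roughly 21.4 GB, so actual hardware requirements depend heavily on the precision chosen for serving.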