kyujinpy/Sakura-SOLRCA-Instruct-DPO

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 10.7B · Quant: FP8 · Ctx Length: 4k · Published: Dec 24, 2023 · License: cc-by-nc-sa-4.0 · Architecture: Transformer · Open Weights · Warm

Sakura-SOLRCA-Instruct-DPO is a 10.7 billion parameter instruction-tuned causal language model developed by Kyujin Han and the LLM research consortium of Media Group Saramgwasup and Marker. This model, fine-tuned using the DPO method on the Intel/orca_dpo_pairs dataset, demonstrates strong performance across various benchmarks, achieving an average score of 74.05 on the Open LLM Leaderboard. It is designed for general-purpose instruction following and reasoning tasks, offering competitive capabilities for its size.


Model Overview

Sakura-SOLRCA-Instruct-DPO is a 10.7 billion parameter instruction-tuned language model developed by Kyujin Han (kyujinpy) in collaboration with the LLM research consortium of Media Group Saramgwasup and Marker. This model was fine-tuned using the Direct Preference Optimization (DPO) method, leveraging the high-quality Intel/orca_dpo_pairs dataset to enhance its instruction-following capabilities.

Key Capabilities & Performance

The model exhibits robust performance across a range of benchmarks, as evaluated on the Hugging Face Open LLM Leaderboard. It achieved an average score of 74.05, with notable results in:

  • AI2 Reasoning Challenge (ARC): 71.16
  • HellaSwag: 88.49
  • MMLU: 66.17
  • TruthfulQA: 72.10
  • Winogrande: 82.95
  • GSM8K: 63.46

These scores indicate strong general reasoning, common-sense, and instruction-following abilities. The model's training details and code are openly shared in the Sakura-SOLAR GitHub repository.
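As a quick sanity check, the quoted leaderboard average can be reproduced from the six per-benchmark scores above (the leaderboard reports the unweighted mean):

```python
# Recompute the Open LLM Leaderboard average from the per-benchmark scores.
scores = {
    "ARC": 71.16,
    "HellaSwag": 88.49,
    "MMLU": 66.17,
    "TruthfulQA": 72.10,
    "Winogrande": 82.95,
    "GSM8K": 63.46,
}
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # matches the reported 74.05 (up to rounding)
```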

When to Use This Model

Sakura-SOLRCA-Instruct-DPO is suitable for applications requiring a capable instruction-following model of its size. Its balanced performance across various benchmarks makes it a strong candidate for:

  • General-purpose conversational AI.
  • Reasoning and question-answering tasks.
  • Applications where a 10.7B parameter model offers a good balance between performance and computational efficiency.
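For readers who want to try the model locally, here is a minimal sketch using Hugging Face `transformers`. The `### User / ### Assistant` prompt template is an assumption (a common convention for SOLAR-derived instruct models), as is the `generate_reply` helper; check the model card's tokenizer/chat template before relying on it:

```python
# Sketch: running kyujinpy/Sakura-SOLRCA-Instruct-DPO with transformers.
# The prompt template below is an ASSUMPTION (SOLAR-style turns), not taken
# from the model card; verify against the tokenizer's chat template.

def build_prompt(user_message: str) -> str:
    """Format a single-turn instruction prompt (assumed SOLAR-style template)."""
    return f"### User:\n{user_message}\n\n### Assistant:\n"


def generate_reply(user_message: str, max_new_tokens: int = 128) -> str:
    """Download the 10.7B weights and generate a reply (heavyweight; call explicitly)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "kyujinpy/Sakura-SOLRCA-Instruct-DPO"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(build_prompt(user_message), return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )
    # Strip the prompt tokens so only the completion is returned.
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
```

At FP8 quantization, the 10.7B weights fit comfortably on a single 24 GB GPU; with the full-precision checkpoint, expect roughly 21 GB in FP16.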

Popular Sampler Settings

The three most popular parameter combinations among Featherless users for this model cover the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
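To make the two most influential of these parameters concrete, here is a minimal, self-contained sketch of temperature scaling and top-p (nucleus) sampling over a toy vocabulary. This is an illustrative reference implementation, not Featherless's actual sampler:

```python
import math
import random

# Illustrative sketch: temperature rescales logits before softmax (lower =
# sharper distribution); top_p keeps only the smallest set of highest-probability
# tokens whose cumulative mass reaches p, then renormalizes and samples.

def sample_next_token(logits, temperature=1.0, top_p=1.0, seed=None):
    """Sample one token id from {token: logit} with temperature and top-p."""
    rng = random.Random(seed)

    # Temperature scaling, then a numerically stable softmax.
    scaled = {tok: l / temperature for tok, l in logits.items()}
    z = max(scaled.values())
    exps = {tok: math.exp(l - z) for tok, l in scaled.items()}
    total = sum(exps.values())
    probs = {tok: e / total for tok, e in exps.items()}

    # Nucleus truncation: keep tokens in descending probability order until
    # their cumulative mass reaches top_p.
    kept, cum = [], 0.0
    for tok, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        kept.append((tok, p))
        cum += p
        if cum >= top_p:
            break

    # Renormalize over the kept set and draw.
    mass = sum(p for _, p in kept)
    r, acc = rng.random() * mass, 0.0
    for tok, p in kept:
        acc += p
        if acc >= r:
            return tok
    return kept[-1][0]
```

With a dominant logit and a small `top_p`, the nucleus collapses to a single token, so sampling becomes deterministic; raising `temperature` flattens the distribution and re-admits lower-probability tokens.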