The dddsaty/SOLAR_Merge_Adapter_DPO_Orca is a 10.7 billion parameter language model created by dddsaty, built by merging two base SOLAR models and applying DPO fine-tuning. This model leverages the strengths of upstage/SOLAR-10.7B-Instruct-v1.0 and beomi/OPEN-SOLAR-KO-10.7B, further refined using the Intel/orca_dpo_pairs dataset. It demonstrates competitive performance across various benchmarks, including ARC, HellaSwag, MMLU, and GSM8K, making it suitable for general-purpose instruction-following tasks.
Model Overview
The dddsaty/SOLAR_Merge_Adapter_DPO_Orca is a 10.7 billion parameter language model developed by dddsaty. It is constructed through a multi-stage process:
- Base Model Merging: Two base models, upstage/SOLAR-10.7B-Instruct-v1.0 and beomi/OPEN-SOLAR-KO-10.7B, were merged using the mergekit (slerp) method.
- DPO Fine-tuning: The merged model then underwent Direct Preference Optimization (DPO) using the Intel/orca_dpo_pairs training corpus. Only the adapter weights were saved during this stage.
- Final Merge: The DPO adapter was subsequently merged back into the base merged model to create the final version.
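The slerp method used in the first step interpolates each pair of corresponding weight tensors along a spherical arc rather than averaging them linearly. The following is a minimal sketch of that interpolation in plain NumPy; it is illustrative only and is not the actual mergekit implementation, which operates per-layer with configurable interpolation factors:

```python
import numpy as np

def slerp(a: np.ndarray, b: np.ndarray, t: float, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight tensors.

    t = 0 returns a, t = 1 returns b; intermediate values of t follow
    the arc between the two weight directions.
    """
    a_flat, b_flat = a.ravel(), b.ravel()
    # Angle between the two weight vectors (via normalized dot product).
    a_norm = a_flat / (np.linalg.norm(a_flat) + eps)
    b_norm = b_flat / (np.linalg.norm(b_flat) + eps)
    dot = np.clip(a_norm @ b_norm, -1.0, 1.0)
    omega = np.arccos(dot)
    if omega < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        return (1 - t) * a + t * b
    so = np.sin(omega)
    out = (np.sin((1 - t) * omega) / so) * a_flat + (np.sin(t * omega) / so) * b_flat
    return out.reshape(a.shape)
```

Compared with a plain weighted average, slerp tends to preserve the geometry of the two parameter sets better when their weight directions differ, which is one reason it is a popular choice for merging same-architecture models.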
Performance Benchmarks
This model exhibits solid performance across a range of academic benchmarks, with an average score of 65.96. Key scores include:
- ARC: 63.91
- HellaSwag: 84.58
- MMLU: 63.18
- TruthfulQA: 51.49
- Winogrande: 82
- GSM8K: 50.57
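The reported 65.96 average is the simple arithmetic mean of the six benchmark scores listed above, which is easy to verify:

```python
# Benchmark scores for the model, as listed above.
scores = {
    "ARC": 63.91,
    "HellaSwag": 84.58,
    "MMLU": 63.18,
    "TruthfulQA": 51.49,
    "Winogrande": 82.00,
    "GSM8K": 50.57,
}

average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # matches the reported 65.96 up to rounding
```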
Intended Use Cases
Given its architecture and DPO fine-tuning, this model is well-suited for general instruction-following tasks, leveraging the combined strengths of its base models and the preference alignment from the Orca DPO dataset. Its 4096-token context length supports a variety of conversational and text generation applications.
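Because the context window is capped at 4096 tokens, conversational applications need to trim older turns so that the prompt plus the generation budget still fits. A minimal sketch of such trimming logic follows; the whitespace-based token count is a stand-in for the model's real tokenizer (in practice you would count tokens with the tokenizer loaded via transformers), and the function name and budget are illustrative:

```python
MAX_CONTEXT = 4096  # model's context length in tokens

def count_tokens(text: str) -> int:
    """Stand-in tokenizer: whitespace split. Replace with the model's
    actual tokenizer for real token counts."""
    return len(text.split())

def trim_history(messages: list[str], max_new_tokens: int = 512) -> list[str]:
    """Drop the oldest messages until the prompt plus the generation
    budget fits inside the context window."""
    budget = MAX_CONTEXT - max_new_tokens
    kept: list[str] = []
    used = 0
    # Walk backwards so the most recent turns are kept.
    for msg in reversed(messages):
        n = count_tokens(msg)
        if used + n > budget:
            break
        kept.append(msg)
        used += n
    return list(reversed(kept))
```

Keeping the most recent turns (rather than the earliest) is the usual choice for chat applications, since the latest context is what the next response depends on.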