macadeliccc/SOLAR-10.7b-Instruct-dpo
macadeliccc/SOLAR-10.7b-Instruct-dpo is a 10.7-billion-parameter instruction-tuned causal language model, fine-tuned from upstage/SOLAR-10.7B-Instruct-v1.0 with direct preference optimization (DPO) on the Intel/orca_dpo_pairs dataset. It improves on its base model, scoring 72.79 on EQ Bench versus the base model's 71.03, and is suited to general instruction-following tasks, with solid results across benchmarks including AGIEval, GPT4All, TruthfulQA, and Bigbench.
Model Overview
macadeliccc/SOLAR-10.7b-Instruct-dpo is a 10.7-billion-parameter instruction-tuned model based on upstage/SOLAR-10.7B-Instruct-v1.0. It was further fine-tuned on the Intel/orca_dpo_pairs dataset, which strengthens its instruction-following capabilities.
Key Capabilities & Performance
- Improved over Base Model: Achieves an EQ Bench score of 72.79, surpassing the base model's 71.03, indicating that the DPO fine-tuning yields a measurable gain.
- Benchmark Performance: Demonstrates solid performance across a range of academic benchmarks:
  - AGIEval: 47.57%
  - GPT4All: 74.3%
  - TruthfulQA: 72.73%
  - Bigbench: 45.76%
- Open LLM Leaderboard: Achieves an average score of 73.54 on the Open LLM Leaderboard, with notable scores in HellaSwag (88.08) and Winogrande (82.32).
- ChatML Template: Utilizes the ChatML chat template for structured conversations.
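The ChatML format delimits each conversation turn with `<|im_start|>` and `<|im_end|>` markers. As a minimal sketch of what a formatted prompt looks like, the helper below builds one by hand (the function name is illustrative; in practice, `tokenizer.apply_chat_template` from Hugging Face transformers applies the model's bundled template for you):

```python
def build_chatml_prompt(messages):
    """Format a list of {role, content} dicts as a ChatML prompt string."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    # A trailing assistant header cues the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize DPO in one sentence."},
]
print(build_chatml_prompt(messages))
```

The generated text should then be requested with `<|im_end|>` as a stop sequence so the model halts at the end of its turn.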
When to Use This Model
- General Instruction Following: Ideal for applications requiring a robust instruction-tuned model for various tasks.
- Enhanced Reasoning: Suitable for tasks benefiting from improved reasoning capabilities, as indicated by its benchmark scores.
- Research and Development: A strong candidate for further fine-tuning or as a base for specialized applications due to its enhanced performance over the original SOLAR-10.7B-Instruct-v1.0.