macadeliccc/SOLAR-10.7b-Instruct-dpo

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:10.7BQuant:FP8Ctx Length:4kPublished:Jan 24, 2024License:cc-by-nc-4.0Architecture:Transformer0.0K Open Weights Warm

macadeliccc/SOLAR-10.7b-Instruct-dpo is a 10.7 billion parameter instruction-tuned causal language model, fine-tuned from upstage/SOLAR-10.7B-Instruct-v1.0 using the Intel/orca_dpo_pairs dataset. This model demonstrates improved performance over its base model, achieving a 72.79 EQ Bench score compared to the base model's 71.03. It is optimized for general instruction-following tasks and exhibits strong performance across various benchmarks including AGIEval, GPT4All, TruthfulQA, and Bigbench.

Loading preview...

Model Overview

macadeliccc/SOLAR-10.7b-Instruct-dpo is a 10.7 billion parameter instruction-tuned model, built upon the upstage/SOLAR-10.7B-Instruct-v1.0 architecture. This model has undergone further fine-tuning using the Intel/orca_dpo_pairs dataset, enhancing its instruction-following capabilities.

Key Capabilities & Performance

  • Improved Instruction Following: Achieves an EQ Bench score of 72.79, surpassing the base model's 71.03, indicating better adherence to instructions.
  • Benchmark Performance: Demonstrates solid performance across a range of academic benchmarks:
    • AGIEval: 47.57%
    • GPT4All: 74.3%
    • TruthfulQA: 72.73%
    • Bigbench: 45.76%
  • Open LLM Leaderboard: Achieves an average score of 73.54 on the Open LLM Leaderboard, with notable scores in HellaSwag (88.08) and Winogrande (82.32).
  • ChatML Template: Utilizes the ChatML chat template for structured conversations.

When to Use This Model

  • General Instruction Following: Ideal for applications requiring a robust instruction-tuned model for various tasks.
  • Enhanced Reasoning: Suitable for tasks benefiting from improved reasoning capabilities, as indicated by its benchmark scores.
  • Research and Development: A strong candidate for further fine-tuning or as a base for specialized applications due to its enhanced performance over the original SOLAR-10.7B-Instruct-v1.0.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p