Featherless-Chat-Models/SOLAR-10.7B-Instruct-v1.0

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:10.7BQuant:FP8Ctx Length:4kPublished:May 9, 2025License:cc-by-nc-4.0Architecture:Transformer Open Weights Warm

SOLAR-10.7B-Instruct-v1.0 is a 10.7 billion parameter instruction-tuned large language model developed by Upstage, built upon the SOLAR-10.7B base model. It utilizes a depth up-scaling (DUS) methodology, integrating Mistral 7B weights and continued pretraining to achieve superior performance. This model is specifically fine-tuned for single-turn conversational tasks, demonstrating strong capabilities across various NLP benchmarks, often outperforming larger models.

Loading preview...

Overview

SOLAR-10.7B-Instruct-v1.0 is a 10.7 billion parameter instruction-tuned large language model developed by Upstage. It is a fine-tuned version of the SOLAR-10.7B base model, optimized for single-turn conversations. The model leverages a novel "depth up-scaling" (DUS) methodology, which involves architectural modifications and continued pretraining, including the integration of Mistral 7B weights.

Key Capabilities & Performance

  • Compact yet Powerful: Despite its 10.7B parameters, it demonstrates performance competitive with, and often superior to, models up to 30B parameters, including Mixtral 8x7B on certain benchmarks.
  • Instruction Fine-Tuning: Utilizes state-of-the-art instruction fine-tuning methods, including Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO), on a diverse dataset mixture.
  • Robustness: Designed for adaptability and robustness, making it an ideal choice for further fine-tuning.
  • Data Integrity: Rigorous data contamination tests confirm the model's integrity, with benchmark-related datasets explicitly excluded from training.

Ideal Use Cases

  • Single-Turn Conversation: Primarily designed and optimized for single-turn conversational interactions.
  • Fine-Tuning Base: Its robust and adaptable nature makes it suitable as a base for further specialized fine-tuning for specific applications.