Featherless-Chat-Models/SOLAR-10.7B-Instruct-v1.0
SOLAR-10.7B-Instruct-v1.0 is a 10.7 billion parameter instruction-tuned large language model developed by Upstage, built on the SOLAR-10.7B base model. The base model was produced with depth up-scaling (DUS): a depthwise-scaled architecture initialized in part from Mistral 7B weights and then continually pretrained. This release is fine-tuned specifically for single-turn conversation and performs strongly across a range of NLP benchmarks, often outperforming considerably larger models.
Overview
SOLAR-10.7B-Instruct-v1.0 is the instruction-tuned variant of the SOLAR-10.7B base model, optimized for single-turn conversation. Depth up-scaling (DUS) scales a smaller transformer in depth, initializes it with Mistral 7B weights, and then continues pretraining; unlike mixture-of-experts approaches, it requires no complex changes to training or inference.
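As a quick orientation, the snippet below loads the model with Hugging Face transformers and runs one single-turn exchange through the tokenizer's built-in chat template. This is a minimal sketch, not part of the card: the Upstage/SOLAR-10.7B-Instruct-v1.0 Hub id, the fp16 dtype, and the generation settings are assumptions you may need to adjust.

```python
# Minimal single-turn inference sketch; Hub id and settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Upstage/SOLAR-10.7B-Instruct-v1.0"  # assumed upstream Hub id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,  # half precision to reduce GPU memory
    device_map="auto",
)

# Single-turn usage: one user message, one assistant reply.
conversation = [{"role": "user", "content": "Explain depth up-scaling in one paragraph."}]
prompt = tokenizer.apply_chat_template(
    conversation, tokenize=False, add_generation_prompt=True
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, use_cache=True)

# Decode only the newly generated tokens, skipping the prompt.
reply = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(reply)
```

Because the model is tuned for single-turn use, each request is best issued as a fresh one-message conversation rather than an accumulating chat history.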
Key Capabilities & Performance
- Compact yet Powerful: Despite its 10.7B parameters, it demonstrates performance competitive with, and often superior to, models up to 30B parameters, including Mixtral 8x7B on certain benchmarks.
- Instruction Fine-Tuning: Uses state-of-the-art instruction fine-tuning methods, including Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO), on a diverse dataset mixture (the standard DPO objective is reproduced after this list for reference).
- Robustness: Designed for adaptability and stability under further training, making it well suited to additional fine-tuning.
- Data Integrity: Rigorous data contamination tests confirm the model's integrity, with benchmark-related datasets explicitly excluded from training.
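For readers unfamiliar with DPO, the standard objective from Rafailov et al. (2023) is reproduced below for reference; the card does not disclose Upstage's exact preference data or hyperparameters, so the dataset D and the temperature β are placeholders.

```latex
\mathcal{L}_{\mathrm{DPO}}(\pi_\theta;\, \pi_{\mathrm{ref}}) =
  -\,\mathbb{E}_{(x,\, y_w,\, y_l) \sim \mathcal{D}}\left[
    \log \sigma\!\left(
      \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
      \;-\;
      \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
    \right)
  \right]
```

Here y_w and y_l are the preferred and rejected responses for prompt x, the reference policy is the frozen SFT model, and β controls how far the trained policy may drift from that reference.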
Ideal Use Cases
- Single-Turn Conversation: Primarily designed and optimized for single-turn conversational interactions.
- Fine-Tuning Base: Its robust and adaptable nature makes it a suitable base for further specialized fine-tuning; a minimal LoRA sketch follows below.
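As a starting point for the fine-tuning use case above, here is a minimal parameter-efficient sketch using LoRA via the Hugging Face peft library. It is illustrative rather than prescriptive: the Hub id, target modules, rank, learning rate, and the tiny in-memory dataset are all assumptions standing in for your own data and hyperparameters.

```python
# Minimal LoRA fine-tuning sketch; all hyperparameters are illustrative.
import torch
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

MODEL_ID = "Upstage/SOLAR-10.7B-Instruct-v1.0"  # assumed upstream Hub id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # Llama-style tokenizers often lack one

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
)

# LoRA adapters on the attention projections; rank and alpha are assumptions.
model = get_peft_model(model, LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
))
model.print_trainable_parameters()  # only the small adapter weights train

# Stand-in dataset: replace with your own instruction-formatted corpus.
texts = ["### User:\nHello?\n\n### Assistant:\nHi, how can I help you today?"]
train_dataset = Dataset.from_dict({"text": texts}).map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="solar-lora",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        learning_rate=2e-4,
        num_train_epochs=1,
        logging_steps=10,
        bf16=True,  # assumes an Ampere-or-newer GPU
    ),
    train_dataset=train_dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

LoRA keeps the 10.7B base weights frozen and trains only small adapter matrices, which keeps memory requirements modest and makes it practical to maintain several task-specific adapters on top of one base model.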