upstage/SOLAR-0-70b-16bit
The SOLAR-0-70b-16bit model by Upstage is a 69-billion-parameter instruction-tuned causal language model, fine-tuned from the LLaMA-2 backbone. It is optimized for general language understanding and generation tasks, with strong results across benchmarks including ARC-Challenge, HellaSwag, MMLU, and TruthfulQA. The model is notable for its long-input capability, supporting sequences of over 10,000 tokens through dynamic RoPE scaling.
Overview
Upstage's SOLAR-0-70b-16bit is a 69-billion-parameter instruction-tuned language model, developed by fine-tuning the LLaMA-2 backbone. It is designed for general-purpose language tasks and has achieved a top ranking on the Hugging Face Open LLM Leaderboard, indicating strong performance relative to other open models.
Key Capabilities
- Instruction Following: Fine-tuned with Orca-style and Alpaca-style datasets for robust instruction adherence.
- Extended Context Handling: Utilizes `rope_scaling` with a dynamic factor of 2, enabling it to process input sequences exceeding 10,000 tokens (see the loading sketch after this list).
- Benchmark Performance: Achieves an average H4 score of 73, with specific scores of 71.1 on ARC-Challenge, 87.9 on HellaSwag, 70.6 on MMLU, and 62.2 on TruthfulQA. It also scored 7.44063 on MT-Bench.
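The extended context window is enabled at load time through Hugging Face transformers' RoPE-scaling support. The sketch below shows one way to load the model with the dynamic factor of 2 described above; the `rope_scaling` argument follows the standard transformers convention, and the hardware comments are assumptions rather than requirements stated on the card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Minimal loading sketch: SOLAR-0-70b-16bit with dynamic RoPE scaling
# (factor 2). Assumes a machine with enough GPU memory for a
# ~70B-parameter model in fp16 (roughly 140 GB across devices).
tokenizer = AutoTokenizer.from_pretrained("upstage/SOLAR-0-70b-16bit")
model = AutoModelForCausalLM.from_pretrained(
    "upstage/SOLAR-0-70b-16bit",
    device_map="auto",          # shard the weights across available GPUs
    torch_dtype=torch.float16,  # 16-bit weights, as the model name implies
    rope_scaling={"type": "dynamic", "factor": 2.0},  # >10k-token inputs
)
```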
Good For
- General Language Generation: Suitable for a wide range of text generation and understanding tasks due to its strong benchmark results.
- Applications Requiring Longer Context: Its ability to handle over 10,000 input tokens makes it effective for tasks needing extensive contextual understanding (a usage sketch follows this list).
- Research and Development: Provides a high-performing, openly available model for experimentation and integration into various AI applications.
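For reference, a hedged usage sketch continuing from the loading example above. The `### User:` / `### Assistant:` prompt layout follows the Orca-style format the model was tuned on, but the exact template, prompt text, and generation settings here are illustrative assumptions.

```python
# Illustrative prompt and generation settings (the exact Orca-style
# template is an assumption, not taken verbatim from the card).
prompt = (
    "### User:\n"
    "Summarize the benefits of long-context language models in two sentences.\n\n"
    "### Assistant:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```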