upstage/SOLAR-0-70b-16bit

Text generation | Concurrency cost: 4 | Model size: 69B | Quantization: FP8 | Context length: 32k | Published: Jul 30, 2023 | Architecture: Transformer

The SOLAR-0-70b-16bit model by Upstage is a 69 billion parameter instruction-tuned causal language model, fine-tuned from the LLaMA-2 backbone. It is optimized for general language understanding and generation, demonstrating strong performance across benchmarks including ARC-Challenge, HellaSwag, MMLU, and TruthfulQA. The model is notable for handling long inputs, supporting sequences of more than 10,000 tokens through dynamic RoPE scaling.


Overview

Upstage's SOLAR-0-70b-16bit is a 69 billion parameter instruction-tuned language model, developed by fine-tuning the LLaMA-2 backbone. It is designed for general-purpose language tasks and achieved a top ranking on the Hugging Face Open LLM Leaderboard at the time of its release, indicating strong performance relative to other open models.
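Because the model is instruction-tuned, inputs are typically wrapped in a chat template before generation. The sketch below assembles the Orca-style `### System:` / `### User:` / `### Assistant:` template reported on the upstream Upstage model card; the exact template and the `format_prompt` helper are assumptions for illustration, so verify against the current model card before relying on them:

```python
def format_prompt(user_message: str, system_message: str = "") -> str:
    """Assemble an Orca-style prompt for SOLAR-0-70b-16bit.

    NOTE: this template follows the upstream model card as of this
    writing and may change; check the card before depending on it.
    """
    prompt = ""
    if system_message:
        prompt += f"### System:\n{system_message}\n\n"
    prompt += f"### User:\n{user_message}\n\n### Assistant:\n"
    return prompt


# Example: build a prompt for a simple request, ready to be passed
# to a tokenizer and the model's generate() call.
prompt = format_prompt("Summarize the benefits of dynamic RoPE scaling.")
print(prompt)
```

The trailing `### Assistant:\n` leaves the completion point open so the model continues in the assistant role.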

Key Capabilities

  • Instruction Following: Fine-tuned with Orca-style and Alpaca-style datasets for robust instruction adherence.
  • Extended Context Handling: Utilizes rope_scaling with a dynamic factor of 2, enabling it to process input sequences exceeding 10,000 tokens.
  • Benchmark Performance: Achieves an average H4 score of 73, with specific scores of 71.1 on ARC-Challenge, 87.9 on HellaSwag, 70.6 on MMLU, and 62.2 on TruthfulQA. It also scored 7.44063 on MT-bench.
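The dynamic RoPE scaling mentioned above can be illustrated with the NTK-aware base adjustment that Hugging Face `transformers` applies when `rope_scaling={"type": "dynamic", "factor": 2.0}` is set. The standalone sketch below mirrors that rule; the default parameter values (RoPE base 10000, 4096-token training context, 128-dim heads, as in LLaMA-2 70B) are assumptions for illustration, not figures from this page:

```python
def dynamic_ntk_base(
    seq_len: int,
    base: float = 10000.0,                # LLaMA-2 default RoPE base (assumed)
    factor: float = 2.0,                  # rope_scaling factor of 2
    max_position_embeddings: int = 4096,  # LLaMA-2 training context (assumed)
    head_dim: int = 128,                  # per-head dim of LLaMA-2 70B (assumed)
) -> float:
    """Return the adjusted RoPE base for a given sequence length.

    Mirrors the dynamic NTK scaling rule used by transformers'
    dynamic rotary embedding: for sequences longer than the training
    context, the base grows so the rotary frequencies stretch to
    cover the extra positions without retraining.
    """
    if seq_len <= max_position_embeddings:
        return base  # within the training context: no adjustment
    scale = factor * seq_len / max_position_embeddings - (factor - 1)
    return base * scale ** (head_dim / (head_dim - 2))


# Within the training context the base is untouched...
print(dynamic_ntk_base(4096))   # 10000.0
# ...while a 10k-token input gets a larger base, stretching the
# rotary frequencies so positions beyond 4096 remain distinguishable.
print(dynamic_ntk_base(10000))
```

This is how a model trained at 4,096 positions can accept the 10,000+ token inputs cited above: the adjustment is computed on the fly from the actual sequence length, so short inputs are unaffected.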

Good For

  • General Language Generation: Suitable for a wide range of text generation and understanding tasks due to its strong benchmark results.
  • Applications Requiring Longer Context: Its ability to handle over 10,000 input tokens makes it effective for tasks needing extensive contextual understanding.
  • Research and Development: Provides a high-performing, openly available model for experimentation and integration into various AI applications.