kyujinpy/Sakura-SOLAR-Instruct
Text Generation | Concurrency Cost: 1 | Model Size: 10.7B | Quant: FP8 | Ctx Length: 4k | Published: Dec 24, 2023 | License: cc-by-nc-sa-4.0 | Architecture: Transformer | Open Weights | Warm

Sakura-SOLAR-Instruct is an instruction-tuned causal language model developed by Kyujin Han (kyujinpy) in collaboration with Media Group Saramgwasup and Marker. This model was created using Mergekit and achieved a notable average score of 74.40 on the Open LLM Leaderboard, ranking first on December 27, 2023. It demonstrates strong performance across various benchmarks including ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, and GSM8K, making it suitable for general-purpose instruction following and reasoning tasks.


Sakura-SOLAR-Instruct Overview

Sakura-SOLAR-Instruct is an instruction-tuned language model developed by Kyujin Han (kyujinpy) as part of an LLM research consortium with Media Group Saramgwasup and Marker. The model was built with Mergekit, a toolkit for merging the weights of existing models, rather than being trained from scratch. Development details, including training information and code, are available in the Sakura-SOLAR repository.

Key Capabilities

  • Strong General Instruction Following: Achieved an average score of 74.40 on the Open LLM Leaderboard, securing the top rank on December 27, 2023.
  • Reasoning and Common Sense: Demonstrates solid performance in reasoning tasks with scores like 70.99 on ARC and 83.66 on Winogrande.
  • Knowledge and Factuality: Scored 66.33 on MMLU and 71.79 on TruthfulQA, indicating broad general knowledge and a tendency to avoid repeating common misconceptions.
  • Mathematical Reasoning: Achieved 65.20 on GSM8K, suggesting capabilities in mathematical problem-solving.
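
The scores above can be cross-checked with a little arithmetic. The page names six benchmarks but quotes only five individual scores, leaving out HellaSwag. The sketch below assumes the 74.40 average is a plain unweighted mean over the six benchmarks and works out the HellaSwag score that average would imply. The variable names are illustrative only.

```python
# Scores quoted on this page (Open LLM Leaderboard, percent).
reported = {
    "ARC": 70.99,
    "Winogrande": 83.66,
    "MMLU": 66.33,
    "TruthfulQA": 71.79,
    "GSM8K": 65.20,
}
reported_average = 74.40
num_benchmarks = 6  # the five above plus HellaSwag, whose score is not quoted here

# Assumption: the average is an unweighted mean over the six benchmarks.
implied_hellaswag = round(
    num_benchmarks * reported_average - sum(reported.values()), 2
)
five_benchmark_mean = round(sum(reported.values()) / len(reported), 2)

print(f"Mean of the five quoted scores: {five_benchmark_mean}")
print(f"HellaSwag score implied by the average: about {implied_hellaswag}")
```

Because the reported average is itself rounded to two decimals, the implied value is only an estimate, accurate to a few hundredths of a point.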

Good for

  • Applications requiring a robust instruction-following model with strong general reasoning abilities.
  • Tasks benefiting from a model with competitive benchmark performance across a range of academic and common sense evaluations.
  • Developers looking for a model with a transparent development process, as training and code details are publicly shared.
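
For developers who want to try the model locally, a minimal sketch using the Hugging Face transformers library might look like the following. It is a sketch under stated assumptions, not the author's documented usage: the "### User / ### Assistant" prompt template is not confirmed by this page, and `build_prompt` and `generate` are hypothetical helper names. Loading in float16 with `device_map="auto"` also assumes that torch and accelerate are installed and that enough GPU memory is available for a 10.7B-parameter model.

```python
MODEL_ID = "kyujinpy/Sakura-SOLAR-Instruct"  # repository name from the page title


def build_prompt(instruction: str) -> str:
    """Wrap an instruction in a simple User/Assistant template.

    Assumption: this template is illustrative and not confirmed by the page.
    """
    return f"### User:\n{instruction.strip()}\n\n### Assistant:\n"


def generate(instruction: str, max_new_tokens: int = 256) -> str:
    """Load the model and return its completion for one instruction."""
    # Imported lazily so build_prompt can be used without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate("Summarize the benefits of model merging in two sentences.")` would download the weights on first use. The prompt plus `max_new_tokens` should stay within the 4k context length listed for the model.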