Yuma42/KangalKhan-Sapphire-7B
KangalKhan-Sapphire-7B Overview
KangalKhan-Sapphire-7B is a 7 billion parameter language model developed by Yuma42. It was created by merging two base models, argilla/CapybaraHermes-2.5-Mistral-7B and argilla/distilabeled-OpenHermes-2.5-Mistral-7B, using the slerp (spherical linear interpolation) merge method. Slerp interpolates the parents' weights along a spherical arc rather than a straight line, which tends to preserve the characteristics of both constituent models.
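To build intuition for the merge method, here is a minimal Python sketch of slerp applied to a pair of weight tensors. It is illustrative only, not the exact code of any merge tool: real merges typically use per-layer interpolation factors, and the function below and the 0.5 factor in the usage comment are assumptions.

```python
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors."""
    v0_flat, v1_flat = v0.flatten().float(), v1.flatten().float()
    # Angle between the two weight vectors (computed on normalized copies).
    v0_unit = v0_flat / (v0_flat.norm() + eps)
    v1_unit = v1_flat / (v1_flat.norm() + eps)
    omega = torch.arccos(torch.clamp(v0_unit @ v1_unit, -1.0, 1.0))
    if omega.abs() < 1e-4:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        merged = (1.0 - t) * v0_flat + t * v1_flat
    else:
        sin_omega = torch.sin(omega)
        merged = (torch.sin((1.0 - t) * omega) / sin_omega) * v0_flat \
               + (torch.sin(t * omega) / sin_omega) * v1_flat
    return merged.reshape(v0.shape).to(v0.dtype)

# Hypothetical usage: merge two state dicts parameter by parameter at t = 0.5.
# merged = {name: slerp(0.5, sd_a[name], sd_b[name]) for name in sd_a}
```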
Key Capabilities & Performance
Evaluated on the Open LLM Leaderboard, KangalKhan-Sapphire-7B achieves an average score of 68.52. Specific benchmark results include:
- AI2 Reasoning Challenge (25-shot): 66.30
- HellaSwag (10-shot): 85.34
- MMLU (5-shot): 63.32
- TruthfulQA (0-shot): 56.09
- Winogrande (5-shot): 78.14
- GSM8K (5-shot): 61.94
These scores indicate strong performance across reasoning, commonsense, and language-understanding tasks. The model is configured with a 4096-token context length and supports bfloat16 precision for efficient inference.
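The model can be loaded with the Hugging Face transformers library. The snippet below is a minimal sketch assuming the repository ships standard config and tokenizer files; the prompt and generation settings are illustrative, not recommendations from the model author.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Yuma42/KangalKhan-Sapphire-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card notes bfloat16 support
    device_map="auto",
)

prompt = "Explain spherical linear interpolation in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```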
Recommended Use Cases
This model is well-suited for general-purpose applications requiring robust text generation and conversational AI; a minimal chat sketch follows the list below. Its balanced performance across benchmarks makes it a versatile choice for tasks such as:
- Content creation
- Chatbot development
- Question answering
- Reasoning-based tasks
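For the chatbot and question-answering use cases, the sketch below continues from the loading snippet above. It assumes the tokenizer ships a chat template (the OpenHermes-based parent models use ChatML-style prompts); the system and user messages are hypothetical.

```python
# Continuing from the loading snippet above (tokenizer and model already created).
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What are three ways to speed up Python code?"},
]
# apply_chat_template formats the conversation with whatever chat template
# the tokenizer provides, then returns the token ids ready for generation.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```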