Yuma42/Llama3.1-SuperHawk-8B
Yuma42/Llama3.1-SuperHawk-8B is an 8 billion parameter language model with a 32,768-token context length, created by Yuma42 by merging Joseph717171/Llama-3.1-SuperNova-8B-Lite_TIES_with_Base and mukaj/Llama-3.1-Hawkish-8B using LazyMergekit. The model targets general language understanding and generation, with balanced results across reasoning and instruction-following benchmarks, and is intended for applications that need a capable 8B model built from merged Llama 3.1 derivatives.
Llama3.1-SuperHawk-8B Overview
Yuma42/Llama3.1-SuperHawk-8B is built on the Llama 3.1 architecture. Yuma42 created it with LazyMergekit by combining two Llama 3.1 derivatives, Joseph717171/Llama-3.1-SuperNova-8B-Lite_TIES_with_Base and mukaj/Llama-3.1-Hawkish-8B, with the goal of folding the strengths of both parent models into a single general-purpose 8B model.
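Because the merge produces a standard Llama 3.1 checkpoint, it should load like any other causal LM in Hugging Face transformers. A minimal sketch, assuming the transformers, accelerate, and torch packages are installed; the bf16 dtype and device_map choices are assumptions, not requirements:

```python
# Minimal loading sketch for the merged checkpoint (assumes
# `pip install transformers accelerate torch`).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Yuma42/Llama3.1-SuperHawk-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 8B weights in bf16 fit on a single 24 GB GPU
    device_map="auto",           # let accelerate place layers on available devices
)
```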
Key Capabilities
- Instruction Following (IFEval): Achieves 79.86% on IFEval (0-shot), indicating a strong ability to follow instructions.
- General Reasoning (BBH): Scores 31.97% on BBH (3-shot), demonstrating moderate performance on complex reasoning.
- Mathematical Reasoning (MATH): Attains 23.49% on MATH Level 5 (4-shot).
- Multitask Language Understanding (MMLU-PRO): Scores 32.73% on MMLU-PRO (5-shot).
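These benchmarks match the Open LLM Leaderboard task set, so the numbers should be reproducible with EleutherAI's lm-evaluation-harness. A hedged sketch for the IFEval figure, assuming lm-eval >= 0.4 is installed; the exact task name and result keys are assumptions based on the harness's conventions:

```python
# Hedged sketch: reproducing the 0-shot IFEval score with
# EleutherAI's lm-evaluation-harness (assumes `pip install lm-eval`).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=Yuma42/Llama3.1-SuperHawk-8B,dtype=bfloat16",
    tasks=["ifeval"],  # assumed task name for the 0-shot IFEval benchmark
)
print(results["results"]["ifeval"])
```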
Good For
- General-purpose text generation: Suitable for a wide array of applications requiring coherent and contextually relevant text.
- Instruction-tuned applications: Its strong IFEval score suggests it holds up in tasks where precise instruction adherence is critical (see the usage sketch after this list).
- Exploratory development: Provides a solid 8B foundation for developers experimenting with merged Llama 3.1 models.
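To exercise the instruction-following behavior, prompts should go through the Llama 3.1 chat template that ships with the tokenizer. A minimal sketch continuing from the loading snippet above; the prompt text is a placeholder and greedy decoding is an arbitrary choice:

```python
# Hedged usage sketch: instruction following via the tokenizer's
# built-in Llama 3.1 chat template (reuses `model`/`tokenizer` from above).
messages = [
    {"role": "user", "content": "Summarize the TIES merge method in two sentences."},
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant header so generation starts cleanly
    return_tensors="pt",
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
# Strip the prompt tokens and decode only the newly generated reply.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```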