huihui-ai/Llama-3.1-8B-Fusion-7030
huihui-ai/Llama-3.1-8B-Fusion-7030 is an 8 billion parameter merged model based on Llama 3.1, created by huihui-ai, that blends arcee-ai/Llama-3.1-SuperNova-Lite (70%) and mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated (30%). This experimental fusion aims to combine the strengths of both base models while remaining coherent and usable. It shows strong performance on IF_Eval and GPQA benchmarks, making it suitable for tasks requiring robust instruction following and general knowledge.
Model Overview
huihui-ai/Llama-3.1-8B-Fusion-7030 is an experimental 8 billion parameter language model based on the Llama 3.1 architecture. Developed by huihui-ai, this model is a blend of two distinct Llama-based models: arcee-ai/Llama-3.1-SuperNova-Lite and mlabonne/Meta-Llama-3.1-8B-Instruct-abliterated. The fusion uses a 7:3 ratio, with 70% of the weights from SuperNova-Lite and 30% from the abliterated Meta-Llama-3.1-8B-Instruct model.
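The model card does not spell out the merge procedure, but the simplest reading of a 7:3 weight blend is per-tensor linear interpolation of the two checkpoints. The sketch below illustrates that idea with toy NumPy arrays standing in for real model weights; the function and parameter names are illustrative, not the actual tooling used:

```python
import numpy as np

def blend_state_dicts(sd_a, sd_b, ratio_a=0.7):
    """Linearly interpolate two matching state dicts:
    blended = ratio_a * A + (1 - ratio_a) * B, applied per tensor."""
    assert sd_a.keys() == sd_b.keys(), "models must share the same parameter names"
    return {name: ratio_a * sd_a[name] + (1.0 - ratio_a) * sd_b[name]
            for name in sd_a}

# Toy tensors standing in for the two parents' weights
supernova = {"layers.0.weight": np.ones((2, 2))}     # contributes 70%
abliterated = {"layers.0.weight": np.zeros((2, 2))}  # contributes 30%
fused = blend_state_dicts(supernova, abliterated, ratio_a=0.7)
print(fused["layers.0.weight"])  # every entry is 0.7
```

Because both parents are fine-tunes of the same Llama 3.1 8B base, every parameter tensor lines up by name and shape, which is what makes this kind of direct interpolation possible at all.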
Key Characteristics
- Merged Weights: Combines the strengths of Arcee.ai's SuperNova-Lite (a Llama-3.1-8B-Instruct-based model) and an uncensored ("abliterated") Llama 3.1 8B Instruct variant, which share the same underlying architecture.
- Experimental Fusion: Part of a series of experiments by huihui-ai to evaluate the impact of different mixing ratios on model performance and coherence.
- Usability: Despite being a simple weight blend, the model remains coherent and does not produce incoherent or "gibberish" outputs.
Performance Highlights
Evaluations indicate that Llama-3.1-8B-Fusion-7030 achieves competitive scores on several benchmarks:
- IF_Eval: Achieves 83.10, outperforming both base models and other fusion ratios.
- GPQA: Scores 32.61, also surpassing its base components and other fusion variants.
- MMLU Pro, TruthfulQA, BBH: While not leading, it maintains strong performance, demonstrating a balanced capability across various reasoning and knowledge tasks.
This model is particularly well-suited for applications requiring a blend of robust instruction following and general knowledge, benefiting from the combined characteristics of its parent models.