agentlans/Llama3.1-SuperDeepFuse

8B
FP8
32768
License: llama3.1
Overview

Llama3.1-SuperDeepFuse: Merged for Enhanced Reasoning

Llama3.1-SuperDeepFuse is an 8-billion-parameter language model developed by agentlans and built on meta-llama/Llama-3.1-8B-Instruct as the base. It merges three high-performance distilled models (arcee-ai/Llama-3.1-SuperNova-Lite, deepseek-ai/DeepSeek-R1-Distill-Llama-8B, and FuseAI/FuseChat-Llama-3.1-8B-Instruct) using the model_stock merge method.
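To give a feel for what a model_stock-style merge does, here is a minimal sketch of the interpolation idea as commonly described for Model Stock: average the fine-tuned weights, then pull that average back toward the base by a ratio derived from the angle between the fine-tuned models' deltas. This is an illustration on toy vectors, not the author's actual merge recipe. The function name `model_stock_merge` and the flat-list representation of a weight tensor are assumptions made for the example.

```python
import math

def model_stock_merge(base, finetuned):
    """Illustrative model_stock-style merge of ONE weight tensor.

    base:      list of floats (the base model's weights, flattened)
    finetuned: list of >= 2 lists of floats (the models being merged)
    Hypothetical helper for illustration; real tools operate per tensor
    on full checkpoints.
    """
    n = len(finetuned)
    if n < 2:
        raise ValueError("need at least two fine-tuned models")

    # Deltas of each fine-tuned model relative to the base.
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]

    # Average pairwise cosine similarity between the deltas.
    cosines = []
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(deltas[i], deltas[j]))
            norm_i = math.sqrt(sum(a * a for a in deltas[i]))
            norm_j = math.sqrt(sum(b * b for b in deltas[j]))
            denom = norm_i * norm_j
            cosines.append(dot / denom if denom > 0 else 0.0)
    cos_theta = sum(cosines) / len(cosines)

    # Interpolation ratio: t = n*cos / (1 + (n-1)*cos).
    # Note: undefined when cos_theta == -1/(n-1); not handled in this sketch.
    t = n * cos_theta / (1 + (n - 1) * cos_theta)

    # Plain average of the fine-tuned weights, interpolated toward the base.
    avg = [sum(col) / n for col in zip(*finetuned)]
    return [t * a + (1 - t) * b for a, b in zip(avg, base)]


# Toy usage: three "models" that agree closely pull the result toward their average.
base = [0.0, 0.0]
models = [[1.0, 0.1], [1.0, -0.1], [0.9, 0.0]]
merged = model_stock_merge(base, models)
print(merged)
```

When the fine-tuned deltas point in the same direction, the ratio approaches 1 and the result is close to the plain average; when they are mutually orthogonal, the ratio approaches 0 and the result stays near the base.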

Key Capabilities

  • Enhanced Multi-Task Reasoning: Designed to improve complex problem-solving across various domains.
  • Improved Mathematical and Coding Performance: Specifically targets better accuracy and utility in quantitative and programming tasks.
  • Multilingual Support: Offers capabilities across multiple languages.
  • Consumer GPU Deployment: Sized to run on consumer-grade GPUs.
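As a rough guide to what consumer GPU deployment means for an 8B model, the arithmetic below estimates the memory needed for the weights alone at a few common precisions. It is a back-of-the-envelope sketch: it assumes a nominal 8 billion parameters and ignores the KV cache, activations, and runtime overhead, all of which add to the real footprint. The helper name `weight_memory_gib` is hypothetical.

```python
def weight_memory_gib(num_params, bytes_per_param):
    """Approximate memory for model weights only, in GiB."""
    return num_params * bytes_per_param / 1024**3

NOMINAL_PARAMS = 8e9  # assumption: nominal "8B" parameter count

# Bytes per parameter for a few common storage precisions.
precisions = [("FP8", 1), ("FP16/BF16", 2), ("FP32", 4)]

for name, nbytes in precisions:
    gib = weight_memory_gib(NOMINAL_PARAMS, nbytes)
    print(f"{name}: about {gib:.1f} GiB for weights")
```

At one byte per parameter the weights come to roughly 7.5 GiB, which is why lower-precision formats matter for fitting a model of this size on a single consumer card.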

Performance Notes

The model maintains Llama 3.1's safety standards and aims for balanced performance. Benchmarking is still in progress; initial evaluations on the Open LLM Leaderboard show an average score of 27.30%, with notable results on IFEval (77.62%) and MMLU-PRO (30.83%). Like all language models, it can produce misleading output, so results should be independently verified. Its capabilities are limited compared to larger model variants.