djuna/L3.1-Promissum_Mane-8B-Della-1.5-calc
djuna/L3.1-Promissum_Mane-8B-Della-1.5-calc is an 8-billion-parameter language model merged with the Della method on top of unsloth/Meta-Llama-3.1-8B. It combines three distinct 8B models: DreadPoor/Spei_Meridiem-8B-model_stock, DreadPoor/Aspire1.1-8B-model_stock, and DreadPoor/Heart_Stolen1.1-8B-Model_Stock. The model supports a 32,768-token context length and is intended for general language tasks, with evaluation results available for reasoning and mathematical benchmarks.
Model Overview
djuna/L3.1-Promissum_Mane-8B-Della-1.5-calc was created by djuna through a Della merge using mergekit. It is built on the unsloth/Meta-Llama-3.1-8B base model and combines the strengths of three 8B models from DreadPoor: Spei_Meridiem-8B-model_stock, Aspire1.1-8B-model_stock, and Heart_Stolen1.1-8B-Model_Stock.
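The card does not reproduce the merge recipe itself, but given the stated method and parameters (Della on unsloth/Meta-Llama-3.1-8B with density 1, lambda 1.05, epsilon 0.04), a mergekit configuration would plausibly look like the following sketch. The per-model weights and the dtype are assumptions, not taken from the card:

```yaml
# Hypothetical mergekit config reconstructed from the card's description.
merge_method: della
base_model: unsloth/Meta-Llama-3.1-8B
models:
  - model: DreadPoor/Spei_Meridiem-8B-model_stock
    parameters:
      weight: 1        # assumed; actual weights not stated on the card
  - model: DreadPoor/Aspire1.1-8B-model_stock
    parameters:
      weight: 1        # assumed
  - model: DreadPoor/Heart_Stolen1.1-8B-Model_Stock
    parameters:
      weight: 1        # assumed
parameters:
  density: 1
  lambda: 1.05
  epsilon: 0.04
dtype: bfloat16        # assumed
```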
Key Capabilities & Performance
This model uses the Della merge method with density = 1, lambda = 1.05, and epsilon = 0.04, and supports a context length of 32,768 tokens. Evaluation on the Open LLM Leaderboard shows an average score of 29.18. Notable per-benchmark results include:
- IFEval (0-Shot): 72.35
- BBH (3-Shot): 34.88
- MATH Lvl 5 (4-Shot): 13.97
- MMLU-PRO (5-shot): 32.26
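Conceptually, a Della-style merge takes each fine-tuned model's delta from the base, stochastically drops low-magnitude delta entries (epsilon controls how strongly keep probability tracks magnitude around the density target), rescales survivors, averages the deltas, and scales the result by lambda. Note that with density = 1, as configured here, almost nothing is dropped, so the merge behaves close to lambda-scaled delta averaging. The following is a simplified, illustrative NumPy sketch, not mergekit's implementation; the rank-to-probability mapping is an assumption for demonstration:

```python
import numpy as np

def della_merge(base, finetuned, density=1.0, lam=1.05, eps=0.04, seed=0):
    """Illustrative Della-style delta merging (not mergekit's exact algorithm).

    For each fine-tuned model: take its delta from the base, assign each entry
    a keep probability centred on `density` and shifted by up to `eps` by its
    magnitude rank (larger deltas kept more often), stochastically drop,
    rescale survivors by 1/p, then average the deltas and scale by `lam`.
    """
    rng = np.random.default_rng(seed)
    merged_delta = np.zeros_like(base, dtype=float)
    for ft in finetuned:
        delta = ft - base
        # Magnitude ranks in [0, n-1]; double argsort yields rank positions.
        ranks = np.argsort(np.argsort(np.abs(delta), axis=None)).reshape(delta.shape)
        n = delta.size
        # Map ranks to keep probabilities in [density - eps, density + eps],
        # clipped to a valid probability range.
        p_keep = density - eps + 2.0 * eps * ranks / max(n - 1, 1)
        p_keep = np.clip(p_keep, 1e-8, 1.0)
        mask = rng.random(delta.shape) < p_keep
        # Rescale surviving entries so the delta is unbiased in expectation.
        merged_delta += np.where(mask, delta / p_keep, 0.0)
    merged_delta /= len(finetuned)
    return base + lam * merged_delta
```

With eps = 0 and density = 1 the procedure is deterministic: the result is simply the base plus lambda times the mean delta, which matches the near-degenerate setting this card reports.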
When to Use This Model
This model is suited to general language understanding and generation tasks where a blend of its constituent models' capabilities is desired. Its IFEval score (72.35) suggests solid instruction following, while its MATH Lvl 5 score (13.97) indicates modest mathematical reasoning ability. Developers looking for a merged Llama 3.1-based model that inherits characteristics from the DreadPoor models listed above may find it useful.