Alelcv27/Qwen2.5-7B-DELLA-v1

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:7.6BQuant:FP8Ctx Length:32kPublished:May 5, 2026Architecture:Transformer Warm

Alelcv27/Qwen2.5-7B-DELLA-v1 is a 7.6 billion parameter language model based on the Qwen2.5-7B-Instruct architecture, created using the DELLA merge method. This model specifically combines specialized versions of Qwen2.5-7B for code generation and mathematical reasoning. It is designed to offer enhanced performance in both programming tasks and complex mathematical problem-solving, leveraging its 32K context window.

Loading preview...

Model Overview

Alelcv27/Qwen2.5-7B-DELLA-v1 is a 7.6 billion parameter language model built upon the Qwen2.5-7B-Instruct base. This model was developed using the advanced DELLA merge method, which intelligently combines the strengths of multiple specialized models.

Key Capabilities

This model is a composite of two distinct Qwen2.5-7B variants, specifically targeting:

  • Code Generation: Incorporates capabilities from Alelcv27/Qwen2.5-7B-Code-v2, suggesting proficiency in understanding and generating programming code.
  • Mathematical Reasoning: Integrates features from Alelcv27/Qwen2.5-7B-Math-CoT, indicating an enhanced ability to handle mathematical problems and Chain-of-Thought reasoning.

Merge Details

The DELLA merge method was applied to combine Qwen/Qwen2.5-7B-Instruct with the specialized code and math models. The configuration involved specific layer ranges and weighting parameters for each contributing model, optimizing for a balanced performance across its intended domains. This approach aims to create a versatile model that excels in both technical coding and analytical mathematical tasks, making it suitable for applications requiring strong logical and computational abilities.