ManniX-ITA/Qwen3.5-4B-M4-ex-LRP

VISION | Concurrency Cost: 1 | Model Size: 4.5B | Quant: BF16 | Ctx Length: 32k | Tool Calling: Supported | Published: Apr 30, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

ManniX-ITA/Qwen3.5-4B-M4-ex-LRP is a 4.5 billion parameter language model by ManniX-ITA, based on the Qwen3.5-4B architecture. It merges two fine-tuned Qwen3.5-4B variants and uses an Explainable LRP (Ex-LRP) method to set the merge weighting. The model is part of a study comparing merging recipes, with a focus on coding benchmarks such as MBPP.

Overview

ManniX-ITA/Qwen3.5-4B-M4-ex-LRP is a 4.5 billion parameter model derived from the Qwen3.5-4B base, developed by ManniX-ITA. It is one merge variant (M4) within a comparative study of model merging techniques. The merge was produced with an Explainable LRP (Ex-LRP) method, which uses AttnLRP relevance scores to determine the merge weighting, as detailed in mergekit PR #682.

Key Characteristics

  • Base Model: Qwen/Qwen3.5-4B.
  • Source Models: Merged from Jackrong/Qwen3.5-4B-Claude-4.6-Opus-Reasoning-Distilled-v2 and Crownelius/Crow-4B-Opus-4.6-Distill-Heretic_Qwen3.5 with specific weightings (0.55 / 0.45).
  • Merging Method: Employs the ex-LRP recipe from mergekit PR #682, using LRP (Layer-wise Relevance Propagation) as the importance signal.
  • Performance: The base Qwen3.5-4B model shows stronger HumanEval performance. On MBPP, the highest pass@1 score among the tested merges (52.20%) was achieved by the M4-v2 variant of this merge (ManniX-ITA/Qwen3.5-4B-M4-v2-ex-LRP-turbo), which surpassed both source models. That figure is reported for the M4-v2 variant, not for this M4 checkpoint.
  • Context Length: Supports a context length of 32768 tokens.
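
The idea of combining fixed merge weights (0.55 / 0.45) with a per-parameter importance signal can be illustrated with a small sketch. This is not the ex-LRP recipe from the mergekit PR, whose details are not reproduced here. It only assumes that each source model comes with a non-negative relevance score per parameter and that the score rescales the prior weight before blending. The function name relevance_weighted_merge and the arguments rel_a and rel_b are hypothetical.

```python
from typing import List, Sequence


def relevance_weighted_merge(
    a: Sequence[float],
    b: Sequence[float],
    rel_a: Sequence[float],
    rel_b: Sequence[float],
    w_a: float = 0.55,
    w_b: float = 0.45,
) -> List[float]:
    """Blend two flattened parameter vectors element by element.

    Illustrative assumption (not the PR's algorithm): the mixing
    coefficient for each parameter is the prior weight scaled by that
    parameter's relevance magnitude, then normalized to sum to 1.
    """
    if not (len(a) == len(b) == len(rel_a) == len(rel_b)):
        raise ValueError("all inputs must have the same length")

    merged = []
    for pa, pb, ra, rb in zip(a, b, rel_a, rel_b):
        score_a = w_a * abs(ra)
        score_b = w_b * abs(rb)
        total = score_a + score_b
        # With no relevance signal at all, fall back to the prior weights.
        alpha = score_a / total if total > 0 else w_a / (w_a + w_b)
        merged.append(alpha * pa + (1.0 - alpha) * pb)
    return merged


if __name__ == "__main__":
    # Equal relevance reduces to a plain 0.55 / 0.45 linear merge.
    print(relevance_weighted_merge([1.0, 2.0], [3.0, 4.0], [1.0, 1.0], [1.0, 1.0]))
```

With equal relevance everywhere, the sketch reduces to a plain linear merge at the prior weights. Where one model's relevance is zero, the other model's parameter is taken unchanged.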

Use Cases

This model is particularly relevant for researchers and developers interested in:

  • Model Merging Research: Understanding the impact of LRP-driven merging techniques on model performance.
  • Code Generation Tasks: The related M4-v2 variant shows improved performance on the MBPP benchmark, suggesting potential for code-related applications where MBPP is a relevant metric.
  • Comparative Analysis: It serves as a specific data point in a broader study of different merging recipes (e.g., DARE-TIES, OMv2, Fisher, LRP) and their effects on the Qwen3.5-4B architecture.
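
For readers comparing merges on MBPP, the sketch below shows how a pass@k score is commonly computed from per-problem sample counts, using the standard unbiased estimator 1 - C(n-c, k) / C(n, k). It is a generic illustration, not the evaluation harness used in the study. The function names pass_at_k and mean_pass_at_k and the example counts are hypothetical.

```python
from math import comb
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem.

    n: samples generated, c: samples that passed all tests, k: budget.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Every size-k subset must contain at least one passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average pass@k over (n, c) pairs, one pair per benchmark problem."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    if not scores:
        raise ValueError("results must not be empty")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical counts: four problems, one sample each, two solved.
    print(mean_pass_at_k([(1, 1), (1, 0), (1, 1), (1, 0)], k=1))
```

With a single sample per problem, pass@1 is simply the fraction of problems whose sample passed all tests.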