ManniX-ITA/Qwen3.5-4B-M5-OMv2-LRP
ManniX-ITA/Qwen3.5-4B-M5-OMv2-LRP is a 4.5-billion-parameter language model based on the Qwen3.5-4B architecture, developed by ManniX-ITA. It is a merged model built with the OMv2 recipe (OBIM-lite + DAREx-q + EMR election), using AttnLRP relevance scores for sparsification, which distinguishes it from other merged models. It achieves 51.40% pass@1 on the MBPP benchmark and 53.05% pass@1 on HumanEval, a balanced pair of scores that makes it well suited to code generation tasks.
Model Overview: Qwen3.5-4B-M5-OMv2-LRP
This model, developed by ManniX-ITA, is a 4.5-billion-parameter variant built on the Qwen/Qwen3.5-4B base model. It is a merge of two fine-tunes, Jackrong/Qwen3.5-4B-Claude-4.6-Opus-Reasoning-Distilled-v2 and Crownelius/Crow-4B-Opus-4.6-Distill-Heretic_Qwen3.5, combined using a specific weighting.
Key Differentiator
The M5 in its name signifies its use of the OMv2 recipe (OBIM-lite + DAREx-q + EMR election) with AttnLRP relevance scores as the importance signal for DAREx-q sparsification. This approach aims to optimize performance by selectively merging parameters based on their relevance.
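To make the idea of relevance-guided sparsification concrete, here is a minimal sketch of a drop-and-rescale step on a delta vector (fine-tuned weights minus base weights). It is not the author's implementation: the function name `sparsify_delta`, the top-k selection rule, the `keep_fraction` parameter, and the `q` rescale divisor are all assumptions chosen for illustration. Original DARE drops entries at random; here the relevance scores decide which entries survive, which is one plausible reading of "importance signal".

```python
def sparsify_delta(delta, relevance, keep_fraction, q=None):
    """Keep the highest-relevance entries of a delta vector and rescale them.

    delta         -- fine-tuned minus base weights, as a flat list of floats
    relevance     -- one importance score per entry (hypothetically from AttnLRP)
    keep_fraction -- share of entries to keep, in (0, 1]
    q             -- rescale divisor (assumption); defaults to keep_fraction,
                     which gives the plain DARE-style 1 / (1 - drop_rate) factor
    """
    if len(delta) != len(relevance):
        raise ValueError("delta and relevance must have the same length")
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")

    # Number of entries that survive; always keep at least one.
    n_keep = max(1, round(len(delta) * keep_fraction))

    # Rank indices by relevance, highest first, and keep the top n_keep.
    order = sorted(range(len(delta)), key=lambda i: relevance[i], reverse=True)
    keep = set(order[:n_keep])

    # Rescale the survivors to compensate for the dropped entries.
    scale = 1.0 / (q if q is not None else keep_fraction)
    return [d * scale if i in keep else 0.0 for i, d in enumerate(delta)]


# Toy example with made-up numbers: keep the two most relevant of four entries.
sparse = sparsify_delta(
    delta=[0.2, -0.4, 0.1, 0.3],
    relevance=[0.9, 0.1, 0.05, 0.8],
    keep_fraction=0.5,
)
```

In this toy call, the entries at positions 0 and 3 carry the highest relevance, so they are kept and doubled while the other two are zeroed. Passing an explicit `q` changes only the rescale factor, not which entries are kept.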
Performance Highlights
Evaluated under identical conditions, the model demonstrates strong performance in code generation benchmarks:
- MBPP pass@1: Achieves 51.40%, reported as the best result of the study for this benchmark and ahead of the source models (e.g., 3.20 percentage points above Crow at 48.20%).
- HumanEval pass@1: Scores 53.05%.
Note that while merging improves MBPP capability, the base Qwen3.5-4B model still holds the highest HumanEval score among the tested variants (60.37%, which is 7.32 percentage points above this model's 53.05%).
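For readers unfamiliar with the metric, the sketch below shows the standard unbiased pass@k estimator and recomputes the score gaps from the figures quoted above. The number of samples per problem used in the study is not stated here, so the `n` and `c` values in the example call are purely illustrative; the function name `pass_at_k` and the `reported` dictionary are likewise introduced only for this sketch.

```python
from math import comb


def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one problem.

    n -- number of samples generated for the problem
    c -- number of those samples that pass all tests
    k -- number of attempts allowed
    """
    if n - c < k:
        # Fewer than k failing samples: any draw of k contains a passing one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


# Illustrative only: 3 passing samples out of 10 gives pass@1 = c / n.
example = pass_at_k(n=10, c=3, k=1)

# Scores quoted in this document, in percent.
reported = {
    "merged_mbpp": 51.40,
    "crow_mbpp": 48.20,
    "merged_humaneval": 53.05,
    "base_humaneval": 60.37,
}

# Gaps in percentage points (positive means the merged model is ahead).
mbpp_gap = round(reported["merged_mbpp"] - reported["crow_mbpp"], 2)
humaneval_gap = round(reported["merged_humaneval"] - reported["base_humaneval"], 2)
```

For k = 1 the estimator reduces to the fraction of passing samples, and the benchmark score is the average of that value over all problems.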
Use Cases
Given its strong performance on the MBPP benchmark, this model is particularly well-suited for:
- Code generation tasks: Especially MBPP-style problems, where a short Python function is written from a natural-language description.
- Research into model merging techniques: Specifically, for understanding the impact of AttnLRP relevance scores in sparsification within the OMv2 recipe.