ManniX-ITA/Qwen3.5-4B-M2-OMv2
ManniX-ITA/Qwen3.5-4B-M2-OMv2 is a 4.5 billion parameter language model based on Qwen/Qwen3.5-4B, featuring a 32768-token context length. This model is a merge using the OMv2 recipe (OBIM-lite + DAREx-q + EMR election) without importance-signal weighting, designed to isolate the recipe's contribution. It is part of a study exploring different merging methodologies, showing improved performance on the MBPP code generation benchmark compared to its source models.
ManniX-ITA/Qwen3.5-4B-M2-OMv2 Overview
This model is a 4.5 billion parameter language model derived from the Qwen/Qwen3.5-4B base, utilizing a 32768-token context window. It represents the 'M2' variant within a broader study by ManniX-ITA, focusing on the effectiveness of different model merging recipes. Specifically, Qwen3.5-4B-M2-OMv2 employs the OMv2 recipe, which combines OBIM-lite, DAREx-q, and EMR election techniques, notably without any importance-signal weighting.
Key Capabilities & Characteristics
- Merging Methodology: Implements the OMv2 recipe, a specific combination of merging techniques, to blend two fine-tuned source models: Jackrong/Qwen3.5-4B-Claude-4.6-Opus-Reasoning-Distilled-v2 and Crownelius/Crow-4B-Opus-4.6-Distill-Heretic_Qwen3.5.
- Performance on Code Tasks: This merged variant (M2) achieves 49.40% pass@1 on MBPP, an improvement over the best source model's 48.20%. For reference, the base Qwen3.5-4B model scores 60.37% pass@1 on HumanEval. The MBPP result suggests that merging can improve specific code generation capabilities.
- Research Focus: This model is a component of a research effort to isolate and understand the contributions of various merging recipes, particularly in the absence of importance-signal weighting.
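The general idea of merging two fine-tunes of a shared base can be sketched on toy weight vectors. The following is a generic illustration only and is not the OMv2 implementation: it uses a plain DARE-style drop-and-rescale step and a simple sign election that averages the deltas agreeing with the majority direction, standing in for the DAREx-q and EMR election steps whose details this card does not give. OBIM-lite is not modelled at all. All function names and the `drop_rate` parameter are hypothetical.

```python
import random

def task_vector(finetuned, base):
    """Delta between a fine-tuned weight vector and the shared base."""
    return [f - b for f, b in zip(finetuned, base)]

def drop_and_rescale(delta, drop_rate, rng):
    """DARE-style step: zero each entry with probability drop_rate,
    then rescale survivors by 1 / (1 - drop_rate)."""
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    keep = 1.0 - drop_rate
    return [d / keep if rng.random() < keep else 0.0 for d in delta]

def elect_and_merge(deltas):
    """Per parameter, pick the sign of the summed deltas and average
    only the entries that agree with it (assumed election rule)."""
    merged = []
    for values in zip(*deltas):
        sign = 1 if sum(values) >= 0 else -1
        agreeing = [v for v in values if v * sign > 0]
        merged.append(sum(agreeing) / len(agreeing) if agreeing else 0.0)
    return merged

def merge(base, finetuned_models, drop_rate=0.5, seed=0):
    """Merge fine-tuned weight vectors back onto the base."""
    rng = random.Random(seed)
    deltas = [
        drop_and_rescale(task_vector(m, base), drop_rate, rng)
        for m in finetuned_models
    ]
    return [b + d for b, d in zip(base, elect_and_merge(deltas))]

# Toy example with dropping disabled so the result is deterministic.
base = [1.0, 1.0, 1.0]
model_a = [2.0, 0.0, 3.0]
model_b = [4.0, 2.0, 0.5]
print(merge(base, [model_a, model_b], drop_rate=0.0))  # [3.0, 2.0, 3.0]
```

In the toy run, the second and third parameters have conflicting deltas, so only the delta matching the elected sign contributes to the merged value.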
Good For
- Research into Model Merging: Ideal for researchers and developers interested in studying the impact of specific merging algorithms like OMv2 on model performance, especially in code generation tasks.
- Code Generation Tasks: While not outperforming the base model on HumanEval, its improved MBPP score indicates potential for certain code-related applications where MBPP-style problems are relevant.
- Comparative Analysis: Useful for comparing the effectiveness of different merging strategies against a common base model and source fine-tunes, as part of the larger ManniX-ITA study.
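To put the MBPP figures in context, the short sketch below shows how a pass@1 percentage is computed from per-problem outcomes and how the reported 49.40% compares with the best source model's 48.20%. The helper names are hypothetical, and the per-problem list is made-up toy data rather than actual evaluation output.

```python
def pass_at_1(passed):
    """Percentage of problems whose single generated sample passed its tests."""
    if not passed:
        raise ValueError("need at least one problem result")
    return 100.0 * sum(1 for ok in passed if ok) / len(passed)

def compare(merged_score, best_source_score):
    """Absolute gain in percentage points and relative gain in percent."""
    absolute = merged_score - best_source_score
    relative = 100.0 * absolute / best_source_score
    return round(absolute, 2), round(relative, 2)

# Toy per-problem outcomes (illustrative only).
print(pass_at_1([True, False, True, True]))  # 75.0

# Scores reported in this card: merged M2 vs. best source model on MBPP.
print(compare(49.40, 48.20))  # (1.2, 2.49)
```

The gain is 1.2 percentage points, or roughly 2.5% relative to the best source model.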