ManniX-ITA/Qwen3.5-4B-M3-Fisher
ManniX-ITA/Qwen3.5-4B-M3-Fisher is a 4.5-billion-parameter language model based on Qwen/Qwen3.5-4B, produced with the OMv2 recipe and diagonal Fisher information weighting. It achieves a HumanEval pass@1 score of 57.93%, a gain of 5.49 percentage points over the OMv2 recipe without Fisher weighting. The model is optimized for code generation tasks.
Qwen3.5-4B-M3-Fisher: Enhanced Code Generation
This model, developed by ManniX-ITA, is a 4.5-billion-parameter variant of the Qwen3.5-4B base model. It is built with a merging technique known as the OMv2 recipe (OBIM-lite + DAREx-q + EMR election), extended with diagonal Fisher information weighting. The Fisher signal drives the DAREx-q sparsification step, which is the change associated with the improved HumanEval result.
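To make the idea of a Fisher signal driving sparsification more concrete, here is a minimal NumPy sketch. It is not the author's implementation and does not attempt to reproduce DAREx-q itself: it assumes an empirical diagonal Fisher estimate (mean of squared per-sample gradients) and a hypothetical drop-and-rescale rule in which the keep probability of each delta entry scales with its Fisher score. The function names, the `keep_rate` parameter, and the toy data are all illustrative assumptions.

```python
import numpy as np


def diagonal_fisher(per_sample_grads: np.ndarray) -> np.ndarray:
    """Empirical diagonal Fisher: mean of squared per-sample gradients.

    per_sample_grads has shape (num_samples, num_params).
    """
    return np.mean(per_sample_grads ** 2, axis=0)


def fisher_weighted_drop_and_rescale(delta, fisher, keep_rate, rng):
    """Hypothetical importance-weighted drop-and-rescale (illustration only).

    Each entry of `delta` (fine-tuned minus base weights) is kept with a
    probability proportional to its Fisher score, and kept entries are
    divided by that probability. Because probabilities are clipped to 1,
    the average keep probability is at most `keep_rate`.
    """
    scores = fisher + 1e-12  # avoid an all-zero score vector
    keep_prob = np.clip(keep_rate * scores / scores.mean(), 0.0, 1.0)
    mask = rng.random(delta.shape) < keep_prob
    sparse = np.zeros_like(delta)
    sparse[mask] = delta[mask] / keep_prob[mask]
    return sparse


# Toy data: the first 100 parameters receive much larger gradients.
rng = np.random.default_rng(0)
grads = rng.normal(size=(32, 1000))
grads[:, :100] *= 10.0
delta = rng.normal(size=1000)

fisher = diagonal_fisher(grads)
sparse_delta = fisher_weighted_drop_and_rescale(delta, fisher, keep_rate=0.1, rng=rng)
print("kept among high-Fisher entries:", np.count_nonzero(sparse_delta[:100]))
print("kept among the remaining entries:", np.count_nonzero(sparse_delta[100:]))
```

In this toy setup, entries with high Fisher scores survive far more often than the rest, which is the general effect an importance signal is meant to have on a sparsification step.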
Key Capabilities & Differentiators
- Superior Code Generation: Achieves a HumanEval pass@1 score of 57.93%, which is 5.49 percentage points higher than the OMv2 recipe without Fisher weighting (implying roughly 52.44% for that baseline).
- Advanced Merging Methodology: Combines the OMv2 recipe with diagonal Fisher information weighting, in contrast to simpler merging approaches such as vanilla DARE-TIES.
- Targeted Optimization: Fisher information serves as the importance signal during merging, and the reported gain is measured on a code-centric evaluation (HumanEval).
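The reported percentages can be turned back into problem counts with a little arithmetic. The sketch below assumes the standard 164-problem HumanEval set and a single sample per problem; neither detail is stated in the card, so treat the counts as an illustration rather than a reported result.

```python
# Assumption: the standard HumanEval benchmark with 164 problems,
# scored with one generated sample per problem (pass@1).
HUMANEVAL_PROBLEMS = 164


def solved_count(percent: float, n_problems: int = HUMANEVAL_PROBLEMS) -> int:
    """Convert a percentage of the benchmark into a whole number of problems."""
    return round(percent / 100 * n_problems)


fisher_solved = solved_count(57.93)              # problems solved with Fisher weighting
gain_problems = solved_count(5.49)               # extra problems solved vs. plain OMv2
baseline_solved = fisher_solved - gain_problems  # implied count for plain OMv2
baseline_percent = round(100 * baseline_solved / HUMANEVAL_PROBLEMS, 2)

print(fisher_solved, gain_problems, baseline_solved, baseline_percent)
# prints: 95 9 86 52.44
```

Under these assumptions the gain corresponds to 9 additional problems solved out of 164.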
When to Use This Model
- Code Development: Ideal for applications requiring robust code generation, completion, or debugging assistance.
- Research in Model Merging: Useful for researchers exploring the impact of importance signals, such as Fisher information, in model merging techniques.
- Benchmarking Code Performance: Provides a strong baseline for evaluating code-specific LLM performance, especially within the 4B parameter range.
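When using the model as a benchmarking baseline, scores such as pass@1 are commonly computed with the unbiased pass@k estimator introduced alongside HumanEval. The following is a minimal sketch of that estimator; the helper names and the toy per-problem counts are illustrative, and the card does not say which evaluation harness or sampling settings produced the 57.93% figure.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem: n samples drawn, c of them correct."""
    if k > n:
        raise ValueError("k must not exceed the number of samples n")
    if n - c < k:
        # Every size-k subset of the samples contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(correct_counts, n: int, k: int) -> float:
    """Average pass@k over problems, given the correct-sample count per problem."""
    return sum(pass_at_k(n, c, k) for c in correct_counts) / len(correct_counts)


# Toy example: 4 problems, 1 sample each, 3 problems solved.
print(benchmark_pass_at_k([1, 0, 1, 1], n=1, k=1))  # prints: 0.75
```

With one sample per problem (n = 1, k = 1), the estimator reduces to the fraction of problems solved.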