ManniX-ITA/Qwen3.5-4B-M3-Fisher

VISION | Concurrency Cost: 1 | Model Size: 4.5B | Quant: BF16 | Ctx Length: 32k | Tool Calling: Supported | Published: Apr 30, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

ManniX-ITA/Qwen3.5-4B-M3-Fisher is a 4.5-billion-parameter language model based on Qwen/Qwen3.5-4B, built with the OMv2 merging recipe plus diagonal Fisher information weighting. It achieves a HumanEval pass@1 score of 57.93%, a +5.49 percentage point improvement over the same OMv2 recipe without Fisher weighting. The model is optimized for code generation, and HumanEval is the benchmark reported for it.

Qwen3.5-4B-M3-Fisher: Enhanced Code Generation

This model, developed by ManniX-ITA, is a 4.5-billion-parameter variant of the Qwen3.5-4B base model. It is produced with the OMv2 merging recipe (OBIM-lite + DAREx-q + EMR election), extended with diagonal Fisher information weighting. The Fisher signal drives the DAREx-q sparsification step, so that decisions about which parameters to keep are informed by an estimate of parameter importance; this is the change credited with the improved benchmark score.
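The idea of an importance signal steering sparsification can be illustrated with a small sketch. This is not the author's DAREx-q implementation, whose details are not given here; it is a generic importance-weighted drop-and-rescale under stated assumptions. The function names `diagonal_fisher` and `fisher_weighted_drop`, the `keep_fraction` parameter, and the clipping floor are all hypothetical. The diagonal Fisher estimate is taken as the mean of squared per-sample gradients, and each delta entry is kept with a probability proportional to its Fisher value, then rescaled by that probability so its expected value is unchanged.

```python
import random


def diagonal_fisher(per_sample_grads):
    """Estimate the Fisher diagonal as the mean squared per-sample gradient."""
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    return [sum(g[i] ** 2 for g in per_sample_grads) / n for i in range(dim)]


def fisher_weighted_drop(delta, fisher, keep_fraction, rng):
    """Randomly drop delta entries, keeping high-Fisher entries more often.

    Assumption (illustrative only): keep probability is proportional to the
    Fisher value, scaled so the average is about `keep_fraction` before
    clipping. Kept entries are divided by their own keep probability, so the
    expected value of each entry matches the original delta.
    """
    mean_f = sum(fisher) / len(fisher)
    sparse, mask = [], []
    for d, f in zip(delta, fisher):
        keep_prob = min(1.0, max(1e-3, keep_fraction * f / (mean_f + 1e-12)))
        keep = rng.random() < keep_prob
        mask.append(keep)
        sparse.append(d / keep_prob if keep else 0.0)
    return sparse, mask


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy stand-in for per-sample gradients of a 8-parameter model.
    grads = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(16)]
    fisher = diagonal_fisher(grads)
    delta = [0.5] * 8  # toy "fine-tuned minus base" weight differences
    sparse, mask = fisher_weighted_drop(delta, fisher, 0.25, rng)
    print(sum(mask), "of", len(mask), "entries kept")
```

A uniform drop rate treats every delta entry the same; weighting the keep probability by an importance estimate is one way to protect the entries that matter most while still reaching a high sparsity level.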

Key Capabilities & Differentiators

  • Superior Code Generation: Achieves a HumanEval pass@1 score of 57.93%, a +5.49 percentage point improvement over the OMv2 recipe without Fisher weighting (which implies roughly 52.44% for that baseline).
  • Advanced Merging Methodology: Uses the OMv2 recipe combined with diagonal Fisher information weighting, rather than a simpler merging approach such as vanilla DARE-TIES.
  • Targeted Optimization: Fisher information weighting is the component associated with the gain on the code-centric evaluation reported here (HumanEval).
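To make the reported numbers concrete, the sketch below shows the standard unbiased pass@k estimator used with HumanEval and converts the percentages above into solved-problem counts. HumanEval contains 164 problems; the assumption that a single sample per problem was scored is ours, since the evaluation settings are not stated here. The helper `solved_count` is a hypothetical name introduced for illustration.

```python
from math import comb

HUMANEVAL_PROBLEMS = 164  # size of the HumanEval benchmark


def pass_at_k(n, c, k):
    """Unbiased pass@k for one problem: n samples drawn, c of them correct."""
    if n - c < k:
        return 1.0  # every size-k subset contains at least one correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)


def solved_count(pass_at_1_percent, total=HUMANEVAL_PROBLEMS):
    """Problems solved, assuming one sample per problem (our assumption)."""
    return round(pass_at_1_percent / 100 * total)


with_fisher = solved_count(57.93)            # 95 problems
without_fisher = solved_count(57.93 - 5.49)  # 86 problems

if __name__ == "__main__":
    print(with_fisher, without_fisher, with_fisher - without_fisher)
```

Under that single-sample assumption, the +5.49 point gain corresponds to 9 additional problems solved out of 164, which is a useful scale to keep in mind when comparing merges on a benchmark this small.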

When to Use This Model

  • Code Development: Suited to applications that need code generation, completion, or debugging assistance.
  • Research in Model Merging: Useful for researchers studying how importance signals, such as Fisher information, affect model merging techniques.
  • Benchmarking Code Performance: Provides a reference point for evaluating code-specific LLM performance in the 4B parameter range.