ManniX-ITA/Qwen3.6-27B-Omnimerge-v4

Vision | Concurrency Cost: 2 | Model Size: 27B | Quant: FP8 | Ctx Length: 32k | Tool Calling: Supported | Published: Apr 29, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

ManniX-ITA/Qwen3.6-27B-Omnimerge-v4 is a 27-billion-parameter language model based on the Qwen3.6 architecture, developed by ManniX-ITA. It is a DARE-TIES merge of the Qwen3.6 base with three Qwen3.6 fine-tunes, plus an "MLP-passthrough surgery" step that addresses a fragility in Qwen3.6's reasoning-tag emission policy. It reports 78.28% on GPQA Diamond and 83.54% on HumanEval, which makes it a candidate for complex problem-solving and code generation.

Overview

ManniX-ITA/Qwen3.6-27B-Omnimerge-v4 is a 27-billion-parameter model built on the Qwen3.6 base, developed by ManniX-ITA. It is a DARE-TIES merge that combines the Qwen3.6 base with three specialized Qwen3.6 fine-tunes. Its distinguishing step is the "MLP-passthrough surgery", which addresses a specific fragility in Qwen3.6's reasoning-tag emission policy (unclosed <think> tags) and is intended to keep output stable, particularly for coding tasks.
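
To make the merge description concrete, here is a toy sketch of the general DARE-TIES idea with an MLP passthrough, written in plain Python over flat weight lists. It is not the author's recipe: the function names, the `density` value, and the `".mlp."` tensor-name filter are all assumptions made for illustration, and the passthrough is modeled simply as "copy the base model's MLP tensors unchanged", following the description above.

```python
import random

def dare_ties_merge(base, finetunes, density=0.5, seed=0):
    """Toy DARE-TIES merge of flat weight lists (illustrative only)."""
    rng = random.Random(seed)
    # DARE: keep each delta entry with probability `density` and rescale
    # the survivors by 1/density so the expected delta is unchanged.
    deltas = []
    for ft in finetunes:
        delta = [f - b for f, b in zip(ft, base)]
        deltas.append(
            [d / density if rng.random() < density else 0.0 for d in delta]
        )
    merged = []
    for i, b in enumerate(base):
        column = [d[i] for d in deltas]
        # TIES: elect a sign per parameter from the summed deltas, then
        # average only the non-zero deltas that agree with that sign.
        sign = 1.0 if sum(column) >= 0 else -1.0
        agreeing = [v for v in column if v * sign > 0]
        update = sum(agreeing) / len(agreeing) if agreeing else 0.0
        merged.append(b + update)
    return merged

def merge_state(base_state, finetune_states, passthrough_key=".mlp.", **kwargs):
    """Merge name -> weights dicts, copying base weights for MLP tensors.

    `passthrough_key` is a hypothetical name filter, not a documented one.
    """
    out = {}
    for name, base_w in base_state.items():
        if passthrough_key in name:
            out[name] = list(base_w)  # passthrough: keep the base MLP as-is
        else:
            out[name] = dare_ties_merge(
                base_w, [s[name] for s in finetune_states], **kwargs
            )
    return out

# With density=1.0 nothing is dropped, so the result is deterministic.
base = [0.0, 0.0]
finetunes = [[1.0, 1.0], [3.0, -1.0], [2.0, -3.0]]
print(dare_ties_merge(base, finetunes, density=1.0))  # -> [2.0, -2.0]
```

In the second position the deltas disagree in sign (1, -1, -3); the elected sign is negative, so only -1 and -3 are averaged. Real merges operate on full tensors and usually add per-model weights, which this sketch omits.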

Key Capabilities & Performance

  • Enhanced Reasoning: Achieves 78.28% pass@1 on the GPQA Diamond benchmark (full greedy result), a +9.09 pp improvement over its predecessor, Omnimerge-v2.
  • Strong Coding Performance: Scores 83.54% pass@1 on HumanEval and 73.00% pass@1 on MBPP (corrected score); the MBPP result is +15.40 pp above the Qwen3.6 base model.
  • MLP-Passthrough: This merge-time modification preserves the base model's MLP layers, which prevents the unclosed <think> tag problem and keeps output stable in reasoning and coding scenarios.
  • Multimodal Support: The model retains the vision tower from the Qwen3.6 base, supporting multimodal applications through its MLX-VL-4bit quantization.
  • Optimized for Inference: Available in several quantizations, including GGUF (for llama.cpp and Ollama) and MLX 4-bit (for Apple Silicon, in text-only and vision-language versions). A companion MTP version offers a 2× decode speedup for interactive workloads.
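
The unclosed <think> tag issue mentioned in the list above can be checked for on the consumer side. The following is a minimal sketch, assuming the model wraps its reasoning in literal `<think>` and `</think>` tags in the returned text; the helper names are hypothetical and not part of any model or library API.

```python
import re

THINK_OPEN = "<think>"

def has_unclosed_think(text: str) -> bool:
    """Return True if at least one <think> tag is never closed."""
    depth = 0
    for match in re.finditer(r"</?think>", text):
        if match.group() == THINK_OPEN:
            depth += 1
        elif depth > 0:
            # A stray closing tag with no opener is ignored.
            depth -= 1
    return depth > 0

def strip_reasoning(text: str) -> str:
    """Remove closed <think>...</think> blocks, keeping the final answer."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

good = "<think>plan the function</think>\ndef add(a, b):\n    return a + b"
bad = "<think>plan the function\ndef add(a, b):\n    return a + b"

print(has_unclosed_think(good))  # -> False
print(has_unclosed_think(bad))   # -> True
print(strip_reasoning(good).splitlines()[0])  # -> def add(a, b):
```

A check like this is useful when extracting code from completions: if the reasoning block never closes, the answer is swallowed into the reasoning text and a naive extractor returns nothing.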

Should I use this for my use case?

This model is particularly well-suited for applications requiring:

  • Advanced Reasoning: Its GPQA Diamond score (78.28% pass@1) makes it a good fit for complex analytical and problem-solving tasks.
  • Code Generation & Understanding: With its HumanEval (83.54%) and MBPP (73.00%) scores, it suits programming assistance, code completion, and explaining code logic.
  • Robust Output: The MLP-passthrough is designed to keep output consistent on reasoning-intensive prompts that might otherwise leave a <think> tag unclosed.
  • Multimodal Applications: If your use case involves processing both text and image inputs, the MLX-VL-4bit version provides full vision-language capabilities.
  • Efficient Deployment: The availability of GGUF and MLX quantizations, along with an MTP version for faster decoding, makes it versatile for various deployment environments, from local machines to high-throughput servers.