Name: ManniX-ITA/Qwen3.6-27B-Omnimerge-v4 API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: ManniX-ITA

Overview

ManniX-ITA/Qwen3.6-27B-Omnimerge-v4 is a 27 billion parameter model built on the Qwen3.6 base, developed by ManniX-ITA. It's a DARE-TIES merge, combining the Qwen3.6 base with three specialized Qwen3.6 fine-tunes. A key innovation is its "MLP-passthrough surgery," which addresses a specific fragility in Qwen3.6's reasoning-tag emission policy, ensuring stable and reliable output, particularly for coding tasks.

Key Capabilities & Performance

Enhanced Reasoning: Achieves 78.28% pass@1 on the GPQA Diamond benchmark (full greedy result), demonstrating a significant +9.09 pp improvement over its predecessor, Omnimerge-v2.
Strong Coding Performance: Scores 83.54% on HumanEval pass@1 and 73.00% on MBPP pass@1 (corrected score), showing substantial gains over the Qwen3.6 base model (+15.40 pp on MBPP).
MLP-Passthrough: This unique architectural modification preserves the base model's MLP layers, preventing issues with unclosed <think> tags and ensuring robust performance in reasoning and coding scenarios.
Multimodal Support: The model retains the vision tower from the Qwen3.6 base, supporting multimodal applications through its MLX-VL-4bit quantization.
Optimized for Inference: Available in various quantizations including GGUF (for llama.cpp, ollama) and MLX 4-bit (for Apple Silicon, text-only and vision-language versions), with a companion MTP version offering 2x decode speedup for interactive workloads.

Should I use this for my use case?

This model is particularly well-suited for applications requiring:

Advanced Reasoning: Its strong GPQA Diamond scores make it ideal for complex analytical and problem-solving tasks.
Code Generation & Understanding: With high HumanEval and MBPP scores, it's excellent for programming assistance, code completion, and understanding code logic.
Robust Output: The MLP-passthrough ensures consistent and reliable output, especially when dealing with reasoning-intensive prompts that might otherwise trigger problematic tag emissions.
Multimodal Applications: If your use case involves processing both text and image inputs, the MLX-VL-4bit version provides full vision-language capabilities.
Efficient Deployment: The availability of GGUF and MLX quantizations, along with an MTP version for faster decoding, makes it versatile for various deployment environments, from local machines to high-throughput servers.

Overview

Overview

Key Capabilities & Performance

Should I use this for my use case?

Full Model Card (README)