0xA50C1A1/Qwen3-4B-Instruct-2507-SOM-MPOA
Text generation · Model size: 4B · Quant: BF16 · Ctx length: 32k · Published: Apr 7, 2026 · License: apache-2.0 · Architecture: Transformer

The 0xA50C1A1/Qwen3-4B-Instruct-2507-SOM-MPOA model is a 4.0-billion-parameter instruction-tuned causal language model, derived from unsloth/Qwen3-4B-Instruct-2507 and decensored with Heretic v1.2.0. It has a native context length of 262,144 tokens and targets general capabilities including instruction following, logical reasoning, mathematics, coding, and long-context understanding. Relative to its base model it refuses far fewer prompts, making it suitable for applications that require less restrictive content generation.


Model Overview

This model, 0xA50C1A1/Qwen3-4B-Instruct-2507-SOM-MPOA, is a 4.0 billion parameter instruction-tuned causal language model based on the Qwen3 architecture. It is a decensored version of unsloth/Qwen3-4B-Instruct-2507, created using the Heretic v1.2.0 tool, specifically designed to reduce content refusals.
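
Since the weights keep the standard Qwen3 layout, the model should load with the usual Hugging Face transformers chat workflow. The snippet below is a minimal sketch, not taken from the card: the prompt and the generation length are illustrative assumptions.

```python
# Minimal sketch: load the model and run one chat turn with transformers.
# Repo id is from the card; everything else is an illustrative assumption.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "0xA50C1A1/Qwen3-4B-Instruct-2507-SOM-MPOA"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # picks up the BF16 weights listed in the metadata
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain RAII in C++ in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```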

Key Capabilities & Enhancements

  • Decensored Output: Significantly reduces refusals: 6 refusals per 100 test prompts, versus 100/100 for the original model.
  • Extended Context Length: Supports a native context length of 262,144 tokens, enabling deep long-context understanding.
  • Improved General Abilities: Demonstrates substantial gains in instruction following, logical reasoning, text comprehension, mathematics, science, coding, and tool usage.
  • Enhanced Alignment: Shows better alignment with user preferences in subjective and open-ended tasks, leading to more helpful and higher-quality text generation.
  • Multilingual Coverage: Offers significant improvements in long-tail knowledge coverage across multiple languages.
  • Agentic Use: Strong tool-calling capabilities; Qwen-Agent is recommended for best results (see the sketch after this list).
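
For the agentic path, the card points to Qwen-Agent. Below is a hedged sketch assuming the model is already served behind an OpenAI-compatible endpoint (e.g. vLLM); the localhost URL, the placeholder api_key, and the choice of the built-in code_interpreter tool are assumptions for illustration, not part of the card.

```python
# Hedged sketch: tool calling via Qwen-Agent against an assumed local endpoint.
from qwen_agent.agents import Assistant

# Assumed: an OpenAI-compatible server is running at this URL.
llm_cfg = {
    "model": "0xA50C1A1/Qwen3-4B-Instruct-2507-SOM-MPOA",
    "model_server": "http://localhost:8000/v1",
    "api_key": "EMPTY",
}

# Assistant agent with Qwen-Agent's built-in code interpreter tool.
bot = Assistant(llm=llm_cfg, function_list=["code_interpreter"])

messages = [{"role": "user", "content": "What is 2**32? Use the code interpreter."}]

# bot.run streams incremental response lists; the last yield is complete.
responses = []
for responses in bot.run(messages=messages):
    pass
print(responses[-1]["content"])
```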

Performance Highlights

The model shows strong performance across various benchmarks, often outperforming its base model and other Qwen3 variants in its class:

  • Knowledge: Achieves 69.6 on MMLU-Pro and 84.2 on MMLU-Redux.
  • Reasoning: Scores 47.4 on AIME25 and 80.2 on ZebraLogic.
  • Coding: Reaches 35.1 on LiveCodeBench v6 and 76.8 on MultiPL-E.
  • Alignment: Attains 83.4 on IFEval and 83.5 on Creative Writing v3.
  • Agent: Scores 61.9 on BFCL-v3 and 48.7 on TAU1-Retail.

Good For

  • Applications requiring a less restrictive content policy.
  • Tasks demanding extensive long-context understanding.
  • General-purpose instruction following and complex reasoning.
  • Coding assistance and tool-use scenarios.
  • Multilingual content generation and comprehension.