p-e-w/Qwen3-4B-Instruct-2507-heretic

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Nov 15, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

p-e-w/Qwen3-4B-Instruct-2507-heretic is a 4 billion parameter instruction-tuned causal language model, derived from Qwen/Qwen3-4B-Instruct-2507, with a massive 262,144 token context length. This model has been decensored using the Heretic tool, significantly reducing refusals from 99/100 to 21/100 while maintaining the original model's enhanced capabilities in instruction following, logical reasoning, mathematics, coding, and long-context understanding. It is primarily designed for applications requiring a highly capable, open-ended language model with reduced content restrictions.

Loading preview...

Model Overview

p-e-w/Qwen3-4B-Instruct-2507-heretic is a 4 billion parameter causal language model, based on the Qwen3-4B-Instruct-2507 architecture, featuring an impressive 262,144 token native context length. This version is specifically notable for being a decensored variant, created using the Heretic v1.0.0 tool, which significantly lowers its refusal rate from 99/100 to 21/100 compared to the original model.

Key Capabilities & Enhancements

This model inherits and builds upon the Qwen3-4B-Instruct-2507's strengths, offering substantial improvements across various domains:

  • General Capabilities: Enhanced instruction following, logical reasoning, text comprehension, mathematics, science, coding, and tool usage.
  • Long-Context Understanding: Excels with its 256K long-context understanding, making it suitable for complex, multi-turn interactions or extensive document analysis.
  • Alignment & Subjectivity: Demonstrates better alignment with user preferences in subjective and open-ended tasks, leading to more helpful responses and higher-quality text generation.
  • Multilingual Support: Features substantial gains in long-tail knowledge coverage across multiple languages.

Performance & Differentiation

While maintaining the high performance of its base model in areas like MMLU-Pro (69.6), AIME25 (47.4), and Creative Writing v3 (83.5), the primary differentiator of this 'heretic' version is its reduced content moderation. This makes it particularly suitable for use cases where the original model's high refusal rate might be restrictive, offering a more unconstrained generative experience. The model operates in a 'non-thinking mode', simplifying its output generation without requiring explicit enable_thinking=False settings.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p