skilledu/qwen3-4b-heretic

Text generation | Concurrency cost: 1 | Model size: 4B | Quant: BF16 | Context length: 32k | Tool calling: Supported | Published: Jun 1, 2026 | License: apache-2.0 | Architecture: Transformer | Open weights | Warm

skilledu/qwen3-4b-heretic is a 4-billion-parameter language model based on the Qwen 3 architecture, developed by skilledu. The model has been processed with the Heretic tool to significantly reduce refusal behaviors while maintaining its original quality. It is primarily optimized for use as an uncensored text encoder in image generation models such as Z-Image and FLUX.2 Klein 4B, and it supports a context length of 32,768 tokens.


skilledu/qwen3-4b-heretic: Abliterated Qwen 3 4B

This model is an abliterated version of the Qwen 3 4B base model, processed with Heretic v1.2.0. Its main distinction is a drastic reduction in refusal behavior: it refused 3 of 100 test prompts, compared with 100 of 100 for the original model. This reduction is reported with zero measurable KL divergence from the original model, which suggests that the model's core capabilities and output quality were not damaged.
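
To make the two reported metrics concrete, here is a minimal sketch of how a refusal rate and a KL divergence between two next-token distributions can be computed. It is illustrative only: the function names and the toy logits are assumptions, and the prompts, token positions, and KL direction that Heretic actually uses are not specified in this description.

```python
import math


def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p_logits, q_logits):
    """KL(P || Q) in nats between the distributions implied by two logit vectors."""
    p = softmax(p_logits)
    q = softmax(q_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def refusal_rate(refusals, trials):
    """Fraction of test prompts that were refused."""
    return refusals / trials


# Toy logits over a 4-token vocabulary (hypothetical values).
original = [2.0, 1.0, 0.5, -1.0]
modified_same = [2.0, 1.0, 0.5, -1.0]
modified_shifted = [1.0, 2.0, 0.5, -1.0]

print(kl_divergence(original, modified_same))     # identical distributions
print(kl_divergence(original, modified_shifted))  # distributions differ
print(refusal_rate(3, 100), refusal_rate(100, 100))
```

A KL divergence of zero means the two distributions are identical on the inputs measured; any change in the output distribution yields a strictly positive value.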

Key Capabilities & Features

  • Reduced Refusals: Refuses far fewer prompts than the base model (3 of 100 trials versus 100 of 100).
  • Quality Preservation: Reported to retain the quality and performance of the base Qwen 3 4B model, as measured by zero KL divergence from it.
  • Optimized for Image Generation: Specifically designed to function as an uncensored text encoder for models such as Z-Image and FLUX.2 Klein 4B.
  • Flexible Formats: Available in Hugging Face format, ComfyUI format (bf16, FP8, NVFP4), and GGUF format (various quantizations; Q4_K_M is recommended) for broad compatibility.
  • NVFP4 Support: Features NVFP4 variants for smaller file sizes and native loading in ComfyUI, with potential for native FP4 tensor core acceleration on Blackwell GPUs.
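
As a rough guide to how the listed precisions affect file size, the sketch below estimates weight storage from a parameter count and a nominal bit width. The 4e9 parameter count is the rounded "4B" figure, and the estimate assumes exactly 16, 8, and 4 bits per weight; it ignores per-block scale factors, metadata, and any tensors kept at higher precision, so real files will differ.

```python
def estimate_weight_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 4e9  # nominal "4B"; the exact count is not given here

# Nominal bit widths (assumption: no overhead for scales or metadata).
for name, bits in [("bf16", 16), ("FP8", 8), ("NVFP4", 4)]:
    print(f"{name}: ~{estimate_weight_gb(NUM_PARAMS, bits):.1f} GB")
```

GGUF quantizations such as Q4_K_M mix several bit widths per tensor, so they are left out of this nominal estimate.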

Use Cases

This model is ideal for developers and artists requiring a less censored text encoder for image generation workflows, particularly within ComfyUI environments. Its reduced refusal rate makes it suitable for creative and open-ended prompting without encountering content restrictions inherent in many base models.
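
Outside ComfyUI, the text-encoder role can be illustrated with the Hugging Face `transformers` library. The following is a minimal sketch, assuming the repository root holds standard Hugging Face weights and that an installed `transformers` version supports Qwen 3. The function name `encode_prompt` is hypothetical, and taking the final hidden layer is an assumption: each image pipeline defines which layer and which prompt template it expects.

```python
def encode_prompt(prompt: str, model_id: str = "skilledu/qwen3-4b-heretic"):
    """Return per-token hidden states with shape [1, seq_len, hidden_size]."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # AutoModel loads the transformer body without the language-modeling head.
    model = AutoModel.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)

    # Assumption: use the final layer's hidden states as the prompt embedding.
    return outputs.last_hidden_state


if __name__ == "__main__":
    # Downloads the weights on first use; needs enough memory for a 4B model in bf16.
    embeddings = encode_prompt("a watercolor painting of a lighthouse at dusk")
    print(embeddings.shape)
```

In a ComfyUI workflow the equivalent step is handled by the text-encoder loader node, so this sketch is mainly useful for inspecting the embeddings directly.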