DreamFast/qwen3-8b-heretic

Text generation · Model size: 8B · Quant: FP8 · Context length: 32k · Concurrency cost: 1 · Published: Mar 20, 2026 · License: apache-2.0 · Architecture: Transformer

DreamFast/qwen3-8b-heretic is an 8 billion parameter Qwen3-based language model, created by DreamFast using the Heretic abliteration method. This model features significantly reduced refusal rates (13/100 vs 100/100 for the base model) while maintaining model quality, making it highly suitable as an uncensored text encoder for image generation models like Klein 9B. It offers a 32768 token context length and is available in various formats including HuggingFace, ComfyUI (with NVFP4 quantization for Blackwell GPUs), and GGUF.


Overview

DreamFast/qwen3-8b-heretic is an 8 billion parameter language model derived from Qwen's Qwen3-8B, processed with the Heretic v1.2.0 abliteration tool. This modification reduces model refusals from 100/100 to 13/100, making the model more permissive while preserving the base model's quality, as indicated by a low KL divergence of 0.0838 between the two models' output distributions.
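The KL divergence figure quantifies how far the abliterated model's output distribution drifts from the base model's (0 means identical outputs). A minimal sketch of the computation, using small made-up next-token distributions rather than actual model outputs:

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) for discrete distributions given as probability lists.
    Returns 0 when the distributions are identical; grows as they diverge."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token distributions over a tiny 4-token vocabulary:
base_model  = [0.70, 0.20, 0.07, 0.03]
abliterated = [0.66, 0.23, 0.08, 0.03]

print(f"{kl_divergence(base_model, abliterated):.4f}")  # small value => outputs stay close
```

A per-token divergence averaged over many prompts is what a figure like 0.0838 summarizes; the lower it is, the less the abliteration disturbed the base model's behavior.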

Key Capabilities & Features

  • Reduced Refusals: Achieves an 87% reduction in refusal rates compared to the base Qwen3-8B model.
  • High Quality: The abliteration process preserves model integrity, with minimal damage as measured by KL divergence against the base model.
  • Optimized for Image Generation: Designed to function effectively as an uncensored text encoder, particularly for models like Klein 9B.
  • Flexible Formats: Provided in HuggingFace, ComfyUI (bf16, FP8, NVFP4), and GGUF (various quantizations including Q4_K_M recommended) formats.
  • NVFP4 Quantization: Offers highly efficient NVFP4 (4-bit floating point) variants, ideal for Blackwell GPUs (RTX 5090/5080) with native FP4 tensor cores, and supported on older GPUs via software dequantization.
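The rough weight-memory arithmetic behind these format choices is 8B parameters × bits per weight. A sketch, noting that the ~4.8 bits/weight for Q4_K_M is an estimate (actual GGUF sizes vary with the per-tensor quant mix) and that KV cache and runtime overhead are not included:

```python
PARAMS = 8e9  # 8 billion parameters

def weight_gib(bits_per_weight: float, params: float = PARAMS) -> float:
    """Approximate GiB needed for model weights alone (excludes KV cache,
    activations, and framework overhead)."""
    return params * bits_per_weight / 8 / 2**30

# Effective bits per weight for each published format; Q4_K_M's ~4.8 is an
# estimate, since mixed-precision GGUF quants have no single fixed width.
for fmt, bits in {"bf16": 16, "FP8": 8, "Q4_K_M": 4.8, "NVFP4": 4}.items():
    print(f"{fmt:>7}: ~{weight_gib(bits):.1f} GiB of weights")
```

By this estimate the NVFP4 variant needs roughly a quarter of the bf16 weight footprint (~3.7 GiB vs ~14.9 GiB), which is why the 4-bit formats suit low-VRAM setups.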

Recommended Use Cases

  • Uncensored Text Encoding: Ideal for applications requiring a less restrictive text encoder, especially in conjunction with image generation models.
  • Creative Content Generation: Suitable for scenarios where the base model's refusal behavior might hinder creative or open-ended text generation.
  • Resource-Efficient Deployment: GGUF and NVFP4 formats enable deployment on systems with varying hardware constraints, from high-end GPUs to low VRAM setups.