8sp4rk/Qwopus3.6-27B-Coder-heretic

VISIONConcurrency Cost:2Model Size:27BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:Jun 17, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

Qwopus3.6-27B-Coder-heretic is a 27 billion parameter model developed by 8sp4rk, based on the Jackrong/Qwopus3.6-27B-Coder and Qwen3.5 architecture with a 32768 token context length. This version has been 'abliterated' using Heretic v1.4.0 to significantly reduce refusal behavior while preserving its core agentic coding and tool-use capabilities. It is optimized for code generation and reasoning tasks without typical safety alignments.

Loading preview...

Qwopus3.6-27B-Coder-heretic: Decensored Agentic Coder

This model, developed by 8sp4rk, is a 27 billion parameter variant of the Jackrong/Qwopus3.6-27B-Coder, built upon the Qwen3.5 hybrid SSM-attention architecture. Its primary distinction is the removal of most refusal behaviors through an 'abliteration' process using Heretic v1.4.0, an automatic, optimization-based directional ablation tool. This process significantly reduces safety alignment without compromising the model's original capabilities.

Key Capabilities & Features

  • Decensored Responses: Achieves a 96% reduction in refusal rates (from 85 to 3 refusals per 100 harmful prompts) compared to its base model.
  • Capability Preservation: Maintains the base model's agentic coding and tool-use abilities, with a low KL divergence (0.0133) indicating minimal damage to core functionality.
  • Code Generation & Reasoning: Inherits strong performance in coding and reasoning tasks from its base.
  • Flexible Deployment: Available in full-precision BF16 safetensors and various GGUF quantizations (Q3_K_M to Q8_0) for use with llama.cpp and Ollama.
  • 32K Context Window: Supports a context length of 32,768 tokens.

Ideal Use Cases

  • Unrestricted Code Generation: Suitable for developers requiring a coding assistant without built-in refusal mechanisms.
  • Research into Model Alignment: Valuable for studying the effects of safety alignment removal and model behavior in less constrained environments.
  • Local Development: Optimized GGUF formats allow for efficient deployment on consumer hardware, including GPUs with 24 GB VRAM (e.g., Q4_K_M).

This model is provided for research and unrestricted local use, with users responsible for its deployment and outputs.