Name: ghost-actual/Qwen3.5-4B-Claude-Opus-4.6-Distilled-heretic API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: ghost-actual

Overview

ghost-actual/Qwen3.5-4B-Claude-Opus-4.6-Distilled-heretic is a 4.5 billion parameter model developed by Ghost and built on the Qwen3.5 architecture. It retains the reasoning behavior distilled from Claude Opus 4.6, while safety refusals were removed with the Heretic tool. The resulting model has a reported refusal rate of 4/100 and a KL divergence of 0.0680, which indicates that its outputs stay close to those of the original model and that very little capability was lost.
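
To give a feel for what a KL divergence figure of this size means, here is a minimal sketch that computes KL divergence between two small discrete distributions. It is a generic illustration of the metric, not Heretic's implementation. The two distributions are made-up values chosen for the example, and the function name `kl_divergence` is hypothetical.

```python
import math


def kl_divergence(p, q):
    """Return KL(P || Q) in nats for two discrete distributions.

    p and q are sequences of probabilities over the same outcomes
    (for example, candidate next tokens).
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # terms with zero probability under P contribute nothing
        if qi == 0:
            return math.inf  # P assigns mass where Q assigns none
        total += pi * math.log(pi / qi)
    return total


# Made-up next-token probabilities, for illustration only.
original = [0.70, 0.20, 0.10]
modified = [0.65, 0.23, 0.12]

print(f"KL(original || modified) = {kl_divergence(original, modified):.4f} nats")
```

A value of zero means the two distributions are identical, and the value grows as the modified model's predictions drift away from the original's.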

Key Capabilities

Claude Opus 4.6 Reasoning: Inherits sophisticated chain-of-thought reasoning from Claude Opus 4.6 distillation.
Abliterated Safety Refusals: Engineered to provide uncensored responses without compromising reasoning quality.
Hybrid Architecture: Combines Gated DeltaNet layers with conventional attention, following the Qwen3.5 design, and includes native multimodal support.
Extended Context: Features a 262K native context length, extensible to over 1M tokens.
Efficient VRAM Usage: Runs on approximately 8-9 GB VRAM in BF16/FP16, making it accessible on consumer-grade GPUs like the RTX 3060.
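
The VRAM figure above can be sanity-checked with simple arithmetic: BF16 and FP16 both store each parameter in 2 bytes. The sketch below estimates the memory needed for the weights alone. It deliberately ignores the KV cache, activations, and framework overhead, so actual usage will be somewhat higher. The function name `weight_memory_bytes` is hypothetical.

```python
def weight_memory_bytes(num_params, bytes_per_param=2):
    """Estimate the memory needed to hold model weights only.

    bytes_per_param defaults to 2, the size of a BF16 or FP16 value.
    """
    return num_params * bytes_per_param


NUM_PARAMS = 4.5e9  # parameter count stated in the overview

weights = weight_memory_bytes(NUM_PARAMS)
weights_gb = weights / 1e9      # decimal gigabytes
weights_gib = weights / 2**30   # binary gibibytes

print(f"Weights: {weights_gb:.1f} GB ({weights_gib:.2f} GiB)")
```

Depending on whether the result is read in decimal gigabytes or binary gibibytes, the weights come to roughly 9 GB or 8.4 GiB, which is consistent with the 8-9 GB range quoted above.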

Good for

Applications requiring high-quality, uncensored reasoning in a compact 4.5B parameter footprint.
Scenarios where preserving Claude-style chain-of-thought is critical, but safety layers need to be minimized or removed.
Developers seeking small, intelligent models that avoid the common pitfalls of other uncensored alternatives (e.g., reduced intelligence or reliance on less capable distillation sources).