ChiKoi7/GPT-5-Distill-llama3.2-3B-Instruct-Heretic

Text Generation · Concurrency Cost: 1 · Model Size: 3.2B · Quant: BF16 · Context Length: 32K · License: llama3.2 · Architecture: Transformer

ChiKoi7/GPT-5-Distill-llama3.2-3B-Instruct-Heretic is a 3.2-billion-parameter instruction-tuned language model based on the Llama 3.2 architecture, decensored using the Heretic tool. It is a distillation of GPT-5 responses that aims to reproduce GPT-5's reasoning and tone in a lightweight package, and it has been specifically processed to reduce refusals in both English and Chinese. With a 32K-token context window, it is optimized for on-device chat, reasoning, summarization, and RAG applications, particularly where censorship resistance is desired.


Overview

ChiKoi7/GPT-5-Distill-llama3.2-3B-Instruct-Heretic is a 3.2 billion parameter instruction-tuned model built upon the Llama 3.2 architecture. It is a decensored version of Jackrong/GPT-5-Distill-llama3.2-3B-Instruct, processed using the Heretic v1.1.0 tool to significantly reduce refusals in both English and Chinese. The original model was a high-efficiency distillation attempt, trained on GPT-5 responses to mimic superior reasoning and conversational patterns, filtered for "normal" (flawless) responses from the LMSYS dataset.

Key Capabilities & Features

  • Decensored Output: Achieves significantly lower refusal rates (3/100 English, 7/100 Chinese) compared to its base model (97/100 English, 88/100 Chinese) due to double-pass Heretic processing.
  • GPT-5 Distilled Logic: Inherits conversational style, politeness, and reasoning structure from over 100,000 filtered GPT-5 responses.
  • Lightweight & Efficient: With ~3.2B parameters, it's optimized for edge devices and consumer GPUs.
  • Long Context Window: Supports a maximum context length of 32,768 tokens, suitable for processing moderate-sized documents.
  • Dual-Language Support: Originally an English/Chinese model, its decensoring process was applied to both languages.
  • GGUF Ready: Quantized versions are available for efficient deployment.
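The capabilities above can be exercised through a standard prompt. The sketch below builds a Llama 3-style instruct prompt by hand; the special tokens are an assumption based on the common Llama 3/3.2 chat format, and in practice the model's own `tokenizer.apply_chat_template` is authoritative.

```python
# Sketch: assembling a Llama 3-style instruct prompt for this model.
# The special tokens follow the standard Llama 3/3.2 chat format -- an
# assumption here; prefer tokenizer.apply_chat_template in real use.

def build_llama3_prompt(messages):
    """Format a list of {"role", "content"} dicts into a chat prompt string."""
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # A trailing assistant header cues the model to generate its reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this document in three sentences."},
])

# With transformers installed, generation would look roughly like:
#   from transformers import AutoModelForCausalLM, AutoTokenizer
#   tok = AutoTokenizer.from_pretrained(
#       "ChiKoi7/GPT-5-Distill-llama3.2-3B-Instruct-Heretic")
#   model = AutoModelForCausalLM.from_pretrained(
#       "ChiKoi7/GPT-5-Distill-llama3.2-3B-Instruct-Heretic",
#       torch_dtype="bfloat16")
#   out = model.generate(**tok(prompt, return_tensors="pt"),
#                        max_new_tokens=256)
print(prompt)
```

For GGUF deployments, the same formatted string can be passed to a llama.cpp-based runtime, which typically applies the chat template automatically.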

Recommended Use Cases

  • On-Device Chat: Ideal for deployment on laptops, phones, and systems with low VRAM.
  • Reasoning & Explanations: Provides clear answers, benefiting from distilled GPT-5 logic.
  • Summarization & Rewriting: Strong capabilities in both English and Chinese.
  • RAG Applications: The 32K context window supports retrieval-augmented generation tasks.
  • Censorship-Resistant Applications: Suitable for use cases where reduced model refusals are critical.
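For the RAG use case, the 32,768-token window has to be budgeted between retrieved passages, the question, and the reply. The sketch below is a minimal, illustrative context packer; the names (`pack_context`, `CTX_TOKENS`) and the ~4-characters-per-token estimate are assumptions, not part of the model card, and a real pipeline would count tokens with the model's tokenizer.

```python
# Sketch: greedily packing retrieved passages into the 32K context budget.
# Token counts are approximated as ~4 characters per token -- an assumption;
# measure with the model's tokenizer in a real pipeline.

CTX_TOKENS = 32_768   # model's maximum context length
RESERVED = 1_024      # headroom for the question and the generated answer

def pack_context(passages, budget_tokens=CTX_TOKENS - RESERVED):
    """Keep passages (highest-ranked first) until the token budget is spent."""
    packed, used = [], 0
    for text in passages:
        cost = max(1, len(text) // 4)   # crude chars-to-tokens estimate
        if used + cost > budget_tokens:
            break                        # drop this and all lower-ranked passages
        packed.append(text)
        used += cost
    return "\n\n".join(packed)

# Highest-ranked passages first; anything over budget is dropped.
context = pack_context(["passage one " * 50, "passage two " * 50])
```

The packed `context` string would then be placed ahead of the user question in the prompt, leaving the reserved headroom for the model's answer.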