Overview
ChiKoi7/GPT-5-Distill-llama3.2-3B-Instruct-Heretic is a ~3.2-billion-parameter instruction-tuned model built on the Llama 3.2 architecture. It is a decensored version of Jackrong/GPT-5-Distill-llama3.2-3B-Instruct, processed with the Heretic v1.1.0 tool to sharply reduce refusals in both English and Chinese. The original model is a high-efficiency distillation trained on GPT-5 responses to mimic GPT-5's reasoning and conversational patterns, using responses from the LMSYS dataset filtered for "normal" (flawless) quality.
Key Capabilities & Features
- Decensored Output: Achieves far lower refusal rates (3/100 English, 7/100 Chinese) than its base model (97/100 English, 88/100 Chinese), thanks to two passes of Heretic processing.
- GPT-5 Distilled Logic: Inherits conversational style, politeness, and reasoning structure from over 100,000 filtered GPT-5 responses.
- Lightweight & Efficient: With ~3.2B parameters, it's optimized for edge devices and consumer GPUs.
- Long Context Window: Supports a maximum context length of 32,768 tokens, suitable for processing moderate-sized documents.
- Dual-Language Support: Trained as an English/Chinese bilingual model, with the decensoring process applied to both languages.
- GGUF Ready: Quantized GGUF builds are available for efficient local deployment.
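Since the model follows the Llama 3.2 chat conventions, prompts are typically wrapped in the standard Llama 3 special tokens. The sketch below builds such a prompt by hand for illustration; the exact template should be confirmed against the checkpoint's `tokenizer_config.json` (in practice, `tokenizer.apply_chat_template` handles this automatically).

```python
# Sketch: composing a Llama-3-style chat prompt manually.
# Assumes the standard Llama 3 special tokens; verify against the
# actual tokenizer config before relying on this format.

def build_llama3_prompt(system: str, user: str) -> str:
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You are a helpful assistant.",
    "Summarize this document in one sentence.",
)
```

The trailing assistant header leaves the model positioned to generate its reply.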
Recommended Use Cases
- On-Device Chat: Ideal for deployment on laptops, phones, and systems with low VRAM.
- Reasoning & Explanations: Provides clear answers, benefiting from distilled GPT-5 logic.
- Summarization & Rewriting: Strong capabilities in both English and Chinese.
- RAG Applications: The 32K context window supports retrieval-augmented generation tasks.
- Censorship-Resistant Applications: Suitable for use cases where reduced model refusals are critical.
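For the RAG use case above, the main constraint is fitting retrieved chunks into the 32,768-token window. The following is a minimal, illustrative sketch: it ranks chunks by naive keyword overlap with the query and packs as many as fit into the budget. A real pipeline would use a vector store and the model's actual tokenizer; `len(text.split())` is only a rough token-count proxy, and the names here are hypothetical.

```python
# Toy RAG context-packing sketch (illustrative assumptions throughout):
# rank chunks by keyword overlap, then greedily fill the context window,
# reserving headroom for the generated answer.

CONTEXT_LIMIT = 32_768        # model's stated maximum context length
RESERVED_FOR_ANSWER = 1_024   # leave room for the reply

def score(chunk: str, query: str) -> int:
    """Naive relevance: count query words that appear in the chunk."""
    q = set(query.lower().split())
    return len(q & set(chunk.lower().split()))

def pack_context(chunks: list[str], query: str) -> list[str]:
    """Return the most relevant chunks that fit the token budget."""
    ranked = sorted(chunks, key=lambda c: score(c, query), reverse=True)
    budget = CONTEXT_LIMIT - RESERVED_FOR_ANSWER
    picked = []
    for chunk in ranked:
        cost = len(chunk.split())  # crude proxy for token count
        if cost <= budget:
            picked.append(chunk)
            budget -= cost
    return picked
```

The selected chunks would then be concatenated into the prompt ahead of the user's question.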