DavidAU/Llama3.3-8B-Instruct-Thinking-Heretic-Uncensored-Claude-4.5-Opus-High-Reasoning

Text generation · Model size: 8B · Quant: FP8 · Hosted context length: 8K · Published: Jan 2, 2026 · License: apache-2.0 · Architecture: Transformer

DavidAU/Llama3.3-8B-Instruct-Thinking-Heretic-Uncensored-Claude-4.5-Opus-High-Reasoning is an 8 billion parameter, Llama 3.3-based, instruction-tuned model adjusted for a 128K context window. It has been de-censored ("Heretic'ed") and further trained on a Claude 4.5-Opus High Reasoning dataset, producing a hybrid instruct/thinking model. It excels at detailed, high-reasoning responses and uncensored content, making it suitable for complex narrative generation and creative tasks.


Llama3.3-8B-Instruct-Thinking-Heretic-Uncensored-Claude-4.5-Opus-High-Reasoning

This model is an 8 billion parameter Llama 3.3 variant, modified for a 128K context window. It stands out for its "Heretic'ed" (de-censored) tuning and subsequent fine-tuning on a high-quality Claude 4.5-Opus High Reasoning dataset. The result is a hybrid model capable of both direct instruction following and a deep, automatic "thinking" process that activates for complex prompts.

Key Capabilities

  • Uncensored Content Generation: Significantly reduced refusals compared to the original model (14 refusals out of 100 test prompts), allowing more explicit or sensitive content generation when directed.
  • High Reasoning: Enhanced reasoning capabilities derived from the Claude 4.5-Opus dataset, suitable for intricate problem-solving and detailed explanations.
  • Hybrid Instruct/Thinking Mode: Automatically engages a "thinking" process for complex prompts (e.g., "Explain orbital mechanics including detailed math and examples"), providing more structured and in-depth responses.
  • Extended Context Window: Supports a 128K context, enabling the processing and generation of longer, more coherent narratives and discussions.
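Because the model is designed to run without a system prompt (its thinking tags self-generate), a user turn can be wrapped directly in the chat template. The sketch below assumes the standard Llama 3 template tokens; verify against the tokenizer config in the repo before relying on it.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a single user turn in the Llama 3 chat template.

    No system turn is included, since this model self-generates
    its own thinking tags (per the model card).
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_message}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

# A complex prompt like this should trigger the "thinking" mode:
prompt = build_prompt(
    "Explain orbital mechanics including detailed math and examples."
)
```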

Good For

  • Creative Writing & Roleplay: Excels at generating detailed stories, including horror or explicit content, with specific directives for tone and language.
  • Complex Explanations: Ideal for tasks requiring deep analysis, mathematical explanations, or structured thought processes.
  • Unrestricted Content: Suited to users who want a model that does not refuse requests on typical censorship grounds, provided the prompt gives clear guidance on the desired output intensity.

For optimal performance, suggested settings are a temperature of 0.7, a repetition penalty of 1.05, and a context window of at least 4K (8K+ recommended). The model is designed to run without a system prompt; thinking tags are generated automatically.
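The suggested settings above can be collected into a kwargs dict for use with a `transformers`-style `model.generate` call. This is a minimal sketch using only the values stated in the card; `max_new_tokens` is an arbitrary placeholder, not a recommendation from the author.

```python
def suggested_generation_kwargs(max_new_tokens: int = 1024) -> dict:
    """Sampling settings suggested by the model card.

    max_new_tokens is a placeholder; tune it for your task.
    """
    return {
        "do_sample": True,              # sampling required for temperature to apply
        "temperature": 0.7,             # suggested temperature
        "repetition_penalty": 1.05,     # suggested repetition penalty
        "max_new_tokens": max_new_tokens,
    }

# Example: pass to model.generate(**inputs, **suggested_generation_kwargs())
kwargs = suggested_generation_kwargs()
```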