Babsie/Llama3.3-8B-Instruct-Thinking-Heretic-Uncensored-Claude-4.5-Opus-High-Reasoning

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:May 20, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

Babsie/Llama3.3-8B-Instruct-Thinking-Heretic-Uncensored-Claude-4.5-Opus-High-Reasoning is an 8 billion parameter Llama 3.3-based instruction-tuned model with a 128k context window. Developed by Babsie, it has been de-censored and fine-tuned with a Claude 4.5-Opus High Reasoning dataset, creating an uncensored instruct/thinking hybrid model. It excels at generating detailed, high-reasoning responses, particularly for creative writing and complex problem-solving, with an emphasis on activating a 'thinking' process for open-ended prompts.

Loading preview...

Model Overview

Babsie/Llama3.3-8B-Instruct-Thinking-Heretic-Uncensored-Claude-4.5-Opus-High-Reasoning is an 8 billion parameter Llama 3.3-based model, distinguished by its 128k context window and unique fine-tuning. The model was initially "Heretic'ed" (de-censored) and then further trained using Unsloth with a high-quality Claude 4.5-Opus High Reasoning dataset. This process has resulted in an uncensored instruct/thinking hybrid model.

Key Capabilities

  • Uncensored Output: Designed to generate content without refusal, including potentially sensitive or explicit themes, though it may require explicit direction for desired intensity.
  • "Thinking" Activation: Automatically engages a detailed reasoning process for open-ended prompts (e.g., "Explain orbital mechanics including detailed math and examples," "Think Deeply: Write a story").
  • High Reasoning: Benefits from training on a Claude 4.5-Opus High Reasoning dataset, enhancing its ability to process and generate complex, thoughtful responses.
  • Extended Context: Features a 128k context length, allowing for processing and generation of longer, more intricate inputs and outputs.

Benchmarks

Initial benchmarks provided by Nightmedia show scores across various tasks:

  • arc_challenge: 0.480
  • arc_easy: 0.687
  • boolq: 0.831
  • hellaswag: 0.705
  • openbookqa: 0.438
  • piqa: 0.780
  • winogrande: 0.646

Recommended Usage

For optimal performance, suggested settings include a temperature of 0.7, repetition penalty of 1.05, topp of 0.95, minp of 0.05, and topk of 40. Users are advised to use a minimum context window of 4k, with 8k+ preferred. The model is designed to operate without a system prompt, as thinking tags will self-generate. For smoother operation, especially in chat/roleplay, a "Smoothing_factor" of 1.5 is recommended in compatible interfaces like KoboldCpp or oobabooga/text-generation-webui.