aeon37/Llama-3.3-8B-Instruct-heretic

Text generation · Concurrency cost: 1 · Model size: 8B · Quant: FP8 · Context length: 8k · Published: Dec 30, 2025 · License: llama3.3 · Architecture: Transformer

aeon37/Llama-3.3-8B-Instruct-heretic is an 8 billion parameter instruction-tuned causal language model, derived from allura-forge/Llama-3.3-8B-Instruct and processed with Heretic v1.1.0 for decensoring. The model has an 8192-token context length and is modified to refuse far fewer prompts than its original counterpart, making it suited to applications that require less restrictive response generation.


Model Overview

aeon37/Llama-3.3-8B-Instruct-heretic is an 8 billion parameter instruction-tuned model, a decensored variant of the allura-forge/Llama-3.3-8B-Instruct model. It was created using the Heretic v1.1.0 tool, specifically to reduce content refusals.
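The model card does not include usage code. As a minimal sketch, assuming the checkpoint is published as a standard Hugging Face `transformers` causal-LM repo and ships the stock Llama 3 chat template, inference might look like this (the `build_chat` helper is a manual fallback for the template and is an illustration, not part of the model's tooling):

```python
# Sketch: generate a reply from the checkpoint with Hugging Face transformers.
# Assumptions: the repo id below resolves to a standard causal-LM checkpoint,
# and its tokenizer carries the usual Llama 3 chat template.
MODEL_ID = "aeon37/Llama-3.3-8B-Instruct-heretic"

def build_chat(messages, add_generation_prompt=True):
    """Render messages into the Llama 3 chat format (manual fallback;
    tokenizer.apply_chat_template is preferred when available)."""
    out = "<|begin_of_text|>"
    for m in messages:
        out += (f"<|start_header_id|>{m['role']}<|end_header_id|>"
                f"\n\n{m['content']}<|eot_id|>")
    if add_generation_prompt:
        out += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return out

if __name__ == "__main__":
    # Imported here so the helper above is usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    messages = [{"role": "user", "content": "Summarize the Llama 3.3 8B model."}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    # Keep prompt plus generation within the 8192-token context window.
    output = model.generate(inputs, max_new_tokens=512)
    print(tokenizer.decode(output[0][inputs.shape[-1]:],
                           skip_special_tokens=True))
```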

Key Characteristics

  • Decensored Output: Significantly lower refusal rate, down from 94 refusals per 100 prompts in the original model to 10 per 100.
  • Base Model Origin: Derived from a version of Llama 3.3 8B that was originally accessible only via the Facebook Llama API and was later made downloadable through a finetuning process.
  • Context Length: Features an 8192-token context window, notably shorter than the 128k context of the original Llama 3.3 API model.
  • Performance Benchmarks: The base Llama 3.3 8B model shows improved performance over Llama 3.1 8B Instruct, with an IFEval score of 81.95 (compared to 78.2) and a GPQA Diamond score of 37.0 (compared to 29.3).
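Heretic belongs to the family of "abliteration"-style decensoring tools, which identify a "refusal direction" in the model's hidden activations and remove it. As a minimal pure-Python sketch of the underlying linear-algebra step only (this is an illustration of directional ablation, not Heretic's actual implementation, which locates the direction from contrastive prompts and edits model weights):

```python
# Sketch of directional ablation: remove the component of a hidden state
# along a (hypothetical) refusal direction, h' = h - (h . v_hat) v_hat.
import math

def unit(v):
    """Normalize a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def ablate(hidden, refusal_dir):
    """Project the refusal direction out of `hidden`."""
    v = unit(refusal_dir)
    proj = sum(h * x for h, x in zip(hidden, v))
    return [h - proj * x for h, x in zip(hidden, v)]
```

After ablation, the hidden state has zero component along the refusal direction, while the rest of the vector is unchanged.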

Use Cases

This model is suitable for applications where less restrictive, more direct response generation is desired, particularly in scenarios where the original model would have refused to answer due to content policies. Its decensored behavior makes it a distinct choice for developers seeking greater flexibility in model outputs.