0xA50C1A1/Llama-3.3-8B-Instruct-128K-Heretic

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Feb 15, 2026 · License: llama3.3 · Architecture: Transformer · Cold

0xA50C1A1/Llama-3.3-8B-Instruct-128K-Heretic is a decensored variant of Llama 3.3 8B Instruct, produced with the Heretic v1.2.0 tool. This 8-billion-parameter model retains the Llama 3.3 architecture and extends the context window to 128K tokens. It is specifically tuned to refuse far less often than the original model, making it suitable for applications that require less restrictive content generation.


Model Overview

0xA50C1A1/Llama-3.3-8B-Instruct-128K-Heretic is a modified version of the shb777/Llama-3.3-8B-Instruct-128K model, which itself is based on allura-forge/Llama-3.3-8B-Instruct. This variant was created using the Heretic v1.2.0 tool, specifically to produce a "decensored" output.

Key Characteristics

  • Decensored Output: The primary modification is a reduced refusal rate. The original model refused 93 of 100 test prompts, while this Heretic version refuses only 4 of 100, indicating significantly less restrictive response generation.
  • Llama 3.3 Architecture: Built upon the Llama 3.3 8B Instruct base model.
  • Extended Context Length: Features a 128K context window, enabling it to process longer inputs and generate longer outputs.
  • Technical Fixes: Inherits fixes from the base model, including added rope_scaling, an Unsloth chat template in the tokenizer config, updated generation config, and enabled full context length.
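The rope_scaling fix mentioned above is what unlocks the extended context: Llama-style models stretch their RoPE position encodings via a scaling entry in `config.json`. The fragment below is an illustrative sketch of what such an entry typically looks like for Llama-3-family long-context models; the specific values are assumptions, not confirmed from this model's actual config.

```json
{
  "max_position_embeddings": 131072,
  "rope_scaling": {
    "rope_type": "llama3",
    "factor": 8.0,
    "low_freq_factor": 1.0,
    "high_freq_factor": 4.0,
    "original_max_position_embeddings": 8192
  }
}
```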

Abliteration Parameters

The decensoring process used directional abliteration: a "refusal direction" is identified in the model's activation space (selected via the direction_index parameter) and then projected out of the attn.o_proj and mlp.down_proj output-projection weights, which suppresses the model's tendency to generate refusals while leaving other behavior largely intact.
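The core weight edit can be illustrated with a minimal NumPy sketch, assuming the standard abliteration formulation: given a unit refusal direction d, each output-projection weight W is replaced by (I − d·dᵀ)·W, so the layer's output can no longer carry a component along d. The function name, shapes, and scale parameter here are illustrative, not taken from the Heretic tool itself.

```python
import numpy as np

def ablate_direction(weight: np.ndarray, direction: np.ndarray,
                     scale: float = 1.0) -> np.ndarray:
    """Project a refusal direction out of an output-projection weight.

    weight:    (d_model, d_in) matrix, e.g. an attn.o_proj or mlp.down_proj
               weight whose rows write into the residual stream.
    direction: (d_model,) vector identified as the refusal direction.
    """
    d = direction / np.linalg.norm(direction)
    # W' = (I - scale * d d^T) W : removes the component of every output
    # of this layer that lies along the refusal direction.
    return weight - scale * np.outer(d, d) @ weight

# Toy demonstration on random data.
rng = np.random.default_rng(0)
W = rng.normal(size=(16, 64))   # stand-in for an o_proj-style weight
d = rng.normal(size=16)         # stand-in for an extracted refusal direction
W_ablated = ablate_direction(W, d)

x = rng.normal(size=64)
out = W_ablated @ x
d_unit = d / np.linalg.norm(d)
residual = abs(d_unit @ out)    # component of the output along d: ~0
```

With scale = 1.0 the direction is removed entirely; intermediate scales would only attenuate it, which is one knob tools like Heretic can search over per layer.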

Use Cases

This model is particularly suited for applications where a less restrictive or "decensored" response is desired, and where the original model's high refusal rate might be a limitation. Developers seeking a Llama 3.3-based model with an extended context and reduced content filtering may find this variant useful.