kldzj/Llama-3.3-70B-Instruct-heretic

TEXT GENERATION

  • Model Size: 70B
  • Quantization: FP8
  • Context Length: 32k
  • Concurrency Cost: 4
  • Published: Nov 23, 2025
  • License: llama3.3
  • Architecture: Transformer

kldzj/Llama-3.3-70B-Instruct-heretic is a 70-billion-parameter, instruction-tuned causal language model: a decensored version of Meta's Llama-3.3-70B-Instruct. Produced by kldzj with the Heretic tool, it is optimized for multilingual dialogue and general natural language generation. It supports a 32,768-token context length and is specifically modified to refuse far fewer requests than the original Llama 3.3 model.


Model Overview

This model, kldzj/Llama-3.3-70B-Instruct-heretic, is a 70-billion-parameter instruction-tuned large language model based on Meta's Llama 3.3 architecture. Its defining trait is that it is a decensored version of the original meta-llama/Llama-3.3-70B-Instruct, produced by applying the Heretic v1.0.1 tool. This modification cuts the model's refusal rate from 70/100 in the original to 6/100, while keeping the KL divergence from the original model at 1.85.
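
Since the model keeps the base Llama 3.3 architecture and chat format, it should load like any other Llama 3.3 checkpoint via Hugging Face transformers. The sketch below is a hedged, minimal example, not an official usage snippet from the card; the generation settings and the hardware assumption (enough GPU memory for a 70B FP8/bf16 checkpoint, hence `device_map="auto"`) are ours:

```python
MODEL_ID = "kldzj/Llama-3.3-70B-Instruct-heretic"


def build_chat(system_prompt: str, user_prompt: str) -> list[dict]:
    """Assemble a message list in the shape the Llama 3.3 chat template expects."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]


def generate(messages: list[dict], max_new_tokens: int = 512) -> str:
    """Load the model (requires substantial GPU memory for 70B weights) and generate.

    Imports are local so that helpers above can be used without transformers installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, device_map="auto", torch_dtype="auto"
    )
    # Render the chat messages with the model's built-in template and generate.
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True)
```

For example, `generate(build_chat("You are a helpful assistant.", "Summarize GQA in one sentence."))` would return the assistant's reply as a string.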

Key Capabilities

  • Decensored Responses: Engineered to provide fewer refusals, offering more direct and unfiltered outputs compared to its base model.
  • Multilingual Dialogue: Optimized for assistant-like chat in multiple languages, including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
  • Optimized Transformer Architecture: An auto-regressive model built on an optimized transformer architecture, incorporating Grouped-Query Attention (GQA) for improved inference scalability.
  • Extensive Training: Pretrained on over 15 trillion tokens of publicly available online data, with a knowledge cutoff of December 2023.
  • Instruction Following: Fine-tuned using supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF) for strong instruction adherence.
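
The GQA mentioned above shares each key/value head across a group of query heads, shrinking the KV cache that must be held in memory during inference while keeping many query heads. A minimal NumPy sketch of the idea (illustrative only, not the model's actual implementation; head counts here are arbitrary):

```python
import numpy as np


def gqa_attention(q: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Grouped-Query Attention: q has more heads than k/v.

    q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d), with n_kv_heads
    dividing n_q_heads. Each KV head serves a group of query heads.
    """
    n_q_heads, _, d = q.shape
    n_kv_heads = k.shape[0]
    group = n_q_heads // n_kv_heads
    # Broadcast each KV head to its group of query heads.
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)
    # Standard scaled dot-product attention per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v


rng = np.random.default_rng(0)
# 8 query heads attend using only 2 KV heads (4 query heads per KV head).
out = gqa_attention(
    rng.standard_normal((8, 4, 16)),
    rng.standard_normal((2, 4, 16)),
    rng.standard_normal((2, 4, 16)),
)
```

With 8 query heads but only 2 KV heads, the KV cache is a quarter of the multi-head-attention size, which is the scalability win the capability bullet refers to.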

Good For

  • Research into Decensored LLMs: Ideal for researchers exploring the impact and behavior of models with reduced safety alignments.
  • Applications Requiring Unfiltered Responses: Suitable for use cases where a more direct or less constrained output is desired, provided appropriate ethical considerations and safeguards are implemented by the developer.
  • Multilingual Chatbots and Assistants: Excels in dialogue-based applications across its supported languages.
  • General Natural Language Generation: Can be adapted for a wide range of text generation tasks where the base Llama 3.3's capabilities are relevant, with the added characteristic of reduced refusals.