notcvnt/Llama-3.1-8B-Instruct-heretic

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Nov 22, 2025License:llama3.1Architecture:Transformer Warm

notcvnt/Llama-3.1-8B-Instruct-heretic is an 8 billion parameter instruction-tuned causal language model, a decensored version of Meta's Llama-3.1-8B-Instruct. This model is specifically modified to reduce refusals, demonstrating a significant decrease from 96/100 to 4/100 compared to the original. It is optimized for multilingual dialogue use cases, supporting 8 languages, and excels in general reasoning, code generation, and tool use, making it suitable for applications requiring less restrictive content policies.

Loading preview...

Model Overview

notcvnt/Llama-3.1-8B-Instruct-heretic is a decensored variant of Meta's Llama-3.1-8B-Instruct, built using the Heretic tool. This 8 billion parameter instruction-tuned model is based on an optimized transformer architecture and features a 32K context length. It is designed to offer significantly fewer refusals, with a reported 4/100 refusals compared to the original model's 96/100, while maintaining a low KL divergence of 0.042.

Key Capabilities

  • Decensored Output: Provides responses with substantially reduced content restrictions compared to the base Llama 3.1 Instruct model.
  • Multilingual Support: Optimized for dialogue in English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
  • Enhanced Performance: Shows improvements in various benchmarks, including MMLU (69.4%), HumanEval (72.6% pass@1), and API-Bank (82.6%) for tool use.
  • Tool Use Integration: Supports multiple tool use formats and integrates with Transformers chat templates for function calling.

Good For

  • Applications requiring less restrictive content generation.
  • Multilingual chatbots and assistant-like dialogue systems.
  • Code generation and complex reasoning tasks.
  • Developers looking to integrate advanced tool-use capabilities.