p-e-w/Llama-3.1-8B-Instruct-heretic

Hosted on Hugging Face

Text generation · Model size: 8B · Quantization: FP8 · Context length: 32K · Published: Nov 15, 2025 · License: llama3.1 · Architecture: Transformer

p-e-w/Llama-3.1-8B-Instruct-heretic is an 8-billion-parameter, instruction-tuned causal language model: a decensored version of Meta's Llama 3.1 8B Instruct. Produced with the Heretic v1.0.0 tool, it refuses far fewer prompts than the original, making it suitable for use cases that require less restrictive content filtering. It retains the Llama 3.1 architecture, offering multilingual text and code generation with a 32K-token context length.


Overview

This model, p-e-w/Llama-3.1-8B-Instruct-heretic, is an 8-billion-parameter instruction-tuned variant of Meta's Llama 3.1 8B Instruct, modified with the Heretic v1.0.0 tool. Its primary distinction is its decensored nature: a refusal rate of 3/100, versus 96/100 for the original model, achieved at a KL divergence of only 0.02 from the original's outputs. The base Llama 3.1 architecture, developed by Meta, is an optimized transformer designed for multilingual dialogue, supporting a 32,768-token context length and trained on over 15 trillion tokens of publicly available data, with a knowledge cutoff of December 2023.
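The KL divergence figure above measures how far the modified model's next-token distribution drifts from the original's. A minimal sketch of the underlying computation, using made-up toy distributions rather than actual model outputs:

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) = sum_i p_i * log(p_i / q_i), in nats.
    Measures how much distribution Q diverges from reference P."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy next-token distributions: the original model (p) vs. a lightly
# modified one (q). These numbers are illustrative only.
p = [0.70, 0.20, 0.10]
q = [0.68, 0.21, 0.11]

print(kl_divergence(p, q))
```

A small value (here on the order of 0.001) indicates the two models assign nearly identical probabilities, which is why a KL divergence of 0.02 suggests the decensoring left general behavior largely intact.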

Key Capabilities

  • Reduced Refusals: Engineered to provide responses with minimal content filtering, offering greater flexibility for diverse applications.
  • Multilingual Support: Capable of generating text and code in English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
  • Instruction Following: Optimized for assistant-like chat and various natural language generation tasks.
  • Tool Use: Supports advanced tool-use and function calling, enabling integration with external services.
  • Strong Performance: Maintains competitive performance across general, reasoning, code, and math benchmarks, including a HumanEval pass@1 score of 72.6 and a MATH (CoT) score of 51.9.
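Because the model keeps the stock Llama 3.1 chat template, instruction-following prompts use the standard header-token format. A minimal sketch that assembles such a prompt by hand (in practice, `tokenizer.apply_chat_template` from the `transformers` library does this for you):

```python
def build_llama31_prompt(messages):
    """Render a message list into the Llama 3.1 chat format.
    Each message is a dict with "role" and "content" keys."""
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        parts.append(
            f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
            f"{msg['content']}<|eot_id|>"
        )
    # Open the assistant turn so the model continues from here.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama31_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize the Llama 3.1 architecture."},
])
print(prompt)
```

The resulting string can be tokenized and passed to the model directly; generation should stop on the `<|eot_id|>` token.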

Good For

  • Research and Development: Ideal for exploring less constrained language generation and understanding model behavior without aggressive safety filters.
  • Creative Applications: Suitable for creative writing, role-playing, and scenarios where unfiltered or diverse content generation is desired.
  • Specialized Chatbots: Can be used in applications where standard LLM safety guardrails might be overly restrictive for the intended use case, provided developers implement their own responsible use policies.
  • Multilingual Applications: Effective for tasks requiring generation in its supported languages, particularly where the original Llama 3.1's capabilities are beneficial.
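The tool-use support noted above typically surfaces as a JSON function call in the model's output, which the application must detect and dispatch. A hedged sketch of that parsing step, assuming the model replies with a single JSON object carrying `name` and `parameters` keys (the `get_weather` tool is hypothetical):

```python
import json

def parse_tool_call(model_output):
    """Extract a function call from model output, assuming a
    single JSON object of the form {"name": ..., "parameters": {...}}.
    Returns (name, parameters) or None for a plain-text answer."""
    try:
        call = json.loads(model_output.strip())
    except json.JSONDecodeError:
        return None  # not JSON, so treat it as an ordinary reply
    if isinstance(call, dict) and "name" in call and "parameters" in call:
        return call["name"], call["parameters"]
    return None

# Example of output a tool-calling turn might produce (illustrative only).
output = '{"name": "get_weather", "parameters": {"city": "Paris"}}'
print(parse_tool_call(output))
```

Keeping the tool-call parser separate from generation makes it easy to add application-level policy checks before any external service is invoked, which matters for a model with reduced built-in refusals.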