AISafety-Student/DeepSeek-R1-Distill-Llama-8B-heretic

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Published: Apr 3, 2026 · License: MIT · Architecture: Transformer · Open Weights · Cold

AISafety-Student/DeepSeek-R1-Distill-Llama-8B-heretic is an 8 billion parameter Llama-based language model, derived from DeepSeek-AI's DeepSeek-R1-Distill-Llama-8B and processed with Heretic v1.2.0 for decensoring. This model is specifically designed to reduce refusals compared to its original counterpart, making it suitable for applications requiring less restrictive content generation. It leverages reasoning patterns distilled from larger models, offering enhanced performance in mathematical, coding, and general reasoning tasks within an 8192-token context window.


AISafety-Student/DeepSeek-R1-Distill-Llama-8B-heretic Overview

This model is an 8 billion parameter Llama-based language model, a decensored version of the original deepseek-ai/DeepSeek-R1-Distill-Llama-8B. It was created using the Heretic v1.2.0 tool, specifically to reduce content refusals.

Key Differentiators & Capabilities

  • Decensored Output: Significantly reduces content refusals, with a reported 4/100 refusals compared to 39/100 in the original model, making it more permissive.
  • Reasoning Distillation: Benefits from reasoning patterns distilled from larger, more powerful models like DeepSeek-R1, enhancing its capabilities in complex problem-solving.
  • Performance: Achieves strong results in various benchmarks, including:
    • AIME 2024 pass@1: 50.4
    • MATH-500 pass@1: 89.1
    • LiveCodeBench pass@1: 39.6
    • CodeForces rating: 1205
  • Llama-3.1 Base: Built upon the Llama-3.1-8B architecture, ensuring a robust foundation.

When to Use This Model

  • Unfiltered Content Generation: Ideal for use cases where a less restrictive content policy is desired, due to its decensored nature.
  • Reasoning Tasks: Suitable for applications requiring strong performance in mathematical, coding, and general reasoning, benefiting from the distilled reasoning capabilities.
  • Llama Ecosystem Integration: Integrates easily into existing workflows that support Llama models; usage follows the same pattern as other Qwen- or Llama-family checkpoints.
  • Local Deployment: Can be served locally with tools such as vLLM or SGLang, offering deployment flexibility (see the sketch after this list).
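
As a rough illustration of the local-deployment path mentioned above, the sketch below uses vLLM's offline Python API. It assumes a recent vLLM release (which exposes the `LLM.chat` helper), a GPU with enough memory for an 8B checkpoint, and that the repo id resolves on the Hugging Face Hub; the sampling settings and prompt are illustrative assumptions, not values prescribed by this model card.

```python
# Minimal local-inference sketch with vLLM's offline API.
# Assumptions: `pip install vllm`, a suitable GPU, and a recent vLLM version
# that provides LLM.chat; none of this is specified by the model card itself.
from vllm import LLM, SamplingParams

MODEL_ID = "AISafety-Student/DeepSeek-R1-Distill-Llama-8B-heretic"

# max_model_len matches the 8k context window listed above.
llm = LLM(model=MODEL_ID, max_model_len=8192)

# R1-style distills emit a reasoning trace before the final answer, so leave
# room for it in max_tokens; temperature/top_p are common R1-distill defaults,
# chosen here for illustration only.
params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=2048)

# LLM.chat applies the model's chat template to the conversation.
conversation = [{"role": "user", "content": "Solve: what is 17 * 24?"}]
outputs = llm.chat(conversation, sampling_params=params)

print(outputs[0].outputs[0].text)
```

For server-style deployment, the same model id can instead be passed to vLLM's or SGLang's serving entry points to expose an OpenAI-compatible endpoint; the offline API above is simply the quickest way to verify the weights load and generate locally.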