lactroiii/llama-3-70B-Instruct-abliterated

TEXT GENERATION · Concurrency Cost: 4 · Model Size: 70B · Quant: FP8 · Ctx Length: 8k · Published: Apr 7, 2026 · License: llama3 · Architecture: Transformer

The lactroiii/llama-3-70B-Instruct-abliterated model is a 70-billion-parameter instruction-tuned Llama 3 variant derived from meta-llama/Llama-3-70B-Instruct. It has been modified with an orthogonalization technique that inhibits refusal behavior, based on research suggesting that refusal is mediated by a single direction in the model's activations. The model retains the original Llama 3 instruction tuning while aiming to reduce ethical lecturing and refusal responses, making it suitable for use cases that require less constrained output.


lactroiii/llama-3-70B-Instruct-abliterated Overview

This model is a 70-billion-parameter instruction-tuned variant of the Llama 3 architecture, specifically derived from meta-llama/Llama-3-70B-Instruct. Its primary distinguishing feature is the application of an "abliteration" methodology: the model's bfloat16 safetensors weights are orthogonalized against an extracted "refusal direction" to inhibit its tendency to express refusal. This technique is based on the research presented in the paper "Refusal in LLMs is mediated by a single direction".
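The core operation behind this kind of abliteration can be sketched as a rank-one projection: for a weight matrix W that writes into the residual stream, subtract the component along the refusal direction d, i.e. W' = (I - d dᵀ) W. The minimal PyTorch illustration below uses random stand-in tensors; the function name and shapes are illustrative, not taken from this model's cookbook.

```python
import torch

def ablate_direction(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Remove the component of `weight`'s output space along `direction`.

    Implements W' = (I - d d^T) W with d unit-normalized, so the layer
    can no longer write anything along the ablated direction.
    """
    d = direction / direction.norm()
    return weight - torch.outer(d, d) @ weight

torch.manual_seed(0)
W = torch.randn(8, 8)   # stand-in for a layer's output-projection weight
d = torch.randn(8)      # stand-in for an extracted refusal direction
W_ortho = ablate_direction(W, d)

# After ablation, the weight's outputs have no component along d:
print(torch.allclose(d @ W_ortho, torch.zeros(8), atol=1e-5))  # True
```

Because the edit is a fixed linear projection baked into the weights, it changes the model's behavior at every forward pass without any runtime intervention or extra inference cost.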

Key Characteristics & Differentiation

  • Refusal Inhibition: The model has undergone orthogonalization to reduce its propensity for ethical lecturing, safety warnings, or outright refusal to answer certain prompts. This is achieved by identifying the strongest "refusal direction" and projecting it out of the relevant weights.
  • Llama 3 Base: It retains the core instruction-tuning and capabilities of the original Llama 3 70B Instruct model, ensuring high performance on general language tasks.
  • Experimental Nature: The methodology is relatively new, and users are encouraged to explore and report any "quirks" or unexpected side effects in the community tab, contributing to further understanding and improvement.
  • Reproducibility: The refusal_dir.pth file and an ortho_cookbook.ipynb are provided, allowing users to apply the same orthogonalization to their own downloaded Llama-3-70B-Instruct models.
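Reproducing the edit from a saved direction might look like the following hedged sketch. The refusal_dir.pth file name comes from the card above, but the state-dict keys, the choice of which matrices to edit, and the helper itself are hypothetical assumptions, not the contents of ortho_cookbook.ipynb.

```python
import torch

def orthogonalize_state_dict(state_dict, direction,
                             key_suffixes=("o_proj.weight", "down_proj.weight")):
    """Project `direction` out of every matching weight matrix in-place.

    `key_suffixes` is an assumption: which matrices to edit depends on the
    architecture and on where the refusal direction was measured.
    """
    d = direction / direction.norm()
    proj = torch.outer(d, d)  # rank-one projector d d^T
    for name, w in state_dict.items():
        if name.endswith(key_suffixes) and w.shape[0] == d.shape[0]:
            state_dict[name] = w - proj @ w
    return state_dict

# Toy demonstration with random tensors standing in for real weights:
hidden = 16
sd = {
    "layers.0.self_attn.o_proj.weight": torch.randn(hidden, hidden),
    "layers.0.mlp.down_proj.weight": torch.randn(hidden, hidden),
}
refusal_dir = torch.randn(hidden)  # in practice: torch.load("refusal_dir.pth")
sd = orthogonalize_state_dict(sd, refusal_dir)
for w in sd.values():
    assert torch.allclose(refusal_dir @ w, torch.zeros(hidden), atol=1e-4)
```

For real bfloat16 checkpoints, the projection should be computed in float32 and cast back, since rank-one updates in bfloat16 lose precision.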

Use Cases

This model is particularly suited for applications where:

  • Reduced Refusal is Desired: Users need a powerful instruction-tuned model that is less likely to refuse or lecture on sensitive topics; note that complete elimination of refusal is not guaranteed.
  • Exploration of Ablation Techniques: Developers and researchers are interested in experimenting with and understanding the effects of targeted weight manipulation on LLM behavior.
  • Unconstrained Content Generation: For use cases that require more direct and less filtered responses, provided ethical considerations are managed by the application developer.