Overview
Greytechai/Meta-Llama-3-8B-Instruct-abliterated-v3 is an 8-billion-parameter instruction-tuned model derived from Meta's Llama-3-8B-Instruct. It applies an "abliteration" process: specific weights, stored as bfloat16 safetensors, are orthogonalized against a refusal direction to inhibit the model's tendency to express refusal. The method builds on research suggesting that refusal in LLMs is mediated by a single direction in activation space, which makes this kind of targeted modification possible.
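To make the idea of orthogonalizing weights concrete, here is a minimal sketch of the underlying linear algebra. It assumes a unit direction r in the space a weight matrix W writes into and computes W' = W - r rᵀ W, so that no output of W' has a component along r. The function names, the toy matrix, and the use of plain Python lists are illustrative assumptions; this is not the pipeline used to produce this model, which would operate on real tensors with a tensor library.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    return [x / norm for x in v]


def orthogonalize(weight, direction):
    """Return W' = W - r r^T W, with r the unit-length direction.

    `weight` is a list of rows with shape (d_out, d_in).
    `direction` has length d_out: it lives in the matrix's output space.
    """
    r = normalize(direction)
    d_out, d_in = len(weight), len(weight[0])
    # proj[j] is entry j of the row vector r^T W.
    proj = [sum(r[i] * weight[i][j] for i in range(d_out)) for j in range(d_in)]
    # Subtract the rank-one matrix r (r^T W) from W.
    return [[weight[i][j] - r[i] * proj[j] for j in range(d_in)]
            for i in range(d_out)]


def matvec(weight, x):
    """Multiply a (d_out, d_in) matrix by a length-d_in vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Toy example (made-up numbers): a 3x2 matrix and a hypothetical direction.
toy_weight = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
toy_direction = [1.0, 1.0, 0.0]
toy_ablated = orthogonalize(toy_weight, toy_direction)

# The component of any output along the direction is now (numerically) zero.
output = matvec(toy_ablated, [0.5, -1.5])
print(dot(output, normalize(toy_direction)))
```

Only the component along r is removed; everything orthogonal to r in the matrix's output is left as it was, which is the intuition behind describing the change as targeted.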
Key Capabilities
- Reduced Refusal Behavior: Engineered to minimize refusals and ethical lecturing, so that responses are more direct.
- Preserves Original Knowledge: Unlike broad fine-tuning, this ablation technique aims to keep the vast majority of the original Llama-3-8B-Instruct's knowledge and training intact.
- Surgical Modification: Offers a precise way to alter one very specific model behavior using significantly less data than traditional fine-tuning requires.
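One reason a single-direction edit can need little data is that the direction itself can be estimated from a modest set of examples. The sketch below shows one approach described in the refusal-direction literature: take the difference between the mean activation on prompts the model refuses and the mean activation on prompts it answers, then normalize. Whether this model's direction was computed exactly this way is an assumption, and the activation vectors and function names here are hypothetical stand-ins for hidden states captured at a single layer and token position.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    n, dim = len(vectors), len(vectors[0])
    return [sum(v[k] for v in vectors) / n for k in range(dim)]


def candidate_direction(refused_acts, complied_acts):
    """Unit vector pointing from the 'complied' mean to the 'refused' mean."""
    mu_refused = mean_vector(refused_acts)
    mu_complied = mean_vector(complied_acts)
    diff = [a - b for a, b in zip(mu_refused, mu_complied)]
    norm = math.sqrt(sum(x * x for x in diff))
    if norm == 0.0:
        raise ValueError("the two means coincide; no direction to extract")
    return [x / norm for x in diff]


# Made-up 3-dimensional "activations"; real ones would have thousands of
# dimensions and come from running the model on two sets of prompts.
refused = [[1.0, 0.0, 2.0], [1.0, 0.0, 4.0]]
complied = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]

print(candidate_direction(refused, complied))
```

The only inputs are two small batches of activations, with no gradient updates or labeled completions, which is the sense in which the data requirement is light compared with fine-tuning.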
Good For
- Uncensored Applications: Suited to scenarios where a model less prone to refusal or ethical lecturing is preferred and its behavior should otherwise stay as close as possible to the original.
- Exploratory Research: Useful for researchers interested in the effects of orthogonalization and ablation techniques on LLM behavior.
- Base for Further Tuning: Can serve as a foundation for subsequent fine-tuning, potentially allowing for more targeted behavioral changes on top of the refusal-inhibited base.
This v3 iteration uses a refined methodology that aims to induce fewer hallucinations than previous versions. Community feedback on any observed quirks or potential improvements is encouraged.