lactroiii/llama-3-70B-Instruct-abliterated

TEXT GENERATION · Concurrency Cost: 4 · Model Size: 70B · Quant: FP8 · Ctx Length: 8k · Published: Apr 7, 2026 · License: llama3 · Architecture: Transformer

The lactroiii/llama-3-70B-Instruct-abliterated model is a 70-billion-parameter instruction-tuned Llama 3 variant derived from meta-llama/Llama-3-70B-Instruct. It has been modified with an orthogonalization technique that inhibits refusal behavior, based on research suggesting that refusal is mediated by a single direction in the model's activations. The model retains the original Llama 3 instruction tuning while aiming to reduce ethical lecturing and refusal responses, making it suitable for use cases that require less constrained output.


lactroiii/llama-3-70B-Instruct-abliterated Overview

This model is a 70-billion-parameter instruction-tuned variant of the Llama 3 architecture, specifically derived from meta-llama/Llama-3-70B-Instruct. Its primary distinguishing feature is the application of an "abliteration" methodology: the model's bfloat16 safetensors weights are orthogonalized against an extracted "refusal direction" to inhibit its tendency to express refusal. This technique is based on the research presented in the paper "Refusal in LLMs is mediated by a single direction".
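The core operation behind this kind of abliteration can be sketched as a rank-one projection: for a weight matrix W that writes into the residual stream, subtract the component along the refusal direction d, i.e. W' = (I - d dᵀ) W. The minimal PyTorch illustration below uses random stand-in tensors; the function name and shapes are illustrative, not taken from this model's cookbook.

```python
import torch

def ablate_direction(weight: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Remove the component of `weight`'s output space along `direction`.

    Implements W' = (I - d d^T) W with d unit-normalized, so the layer
    can no longer write anything along the ablated direction.
    """
    d = direction / direction.norm()
    return weight - torch.outer(d, d) @ weight

torch.manual_seed(0)
W = torch.randn(8, 8)   # stand-in for a layer's output-projection weight
d = torch.randn(8)      # stand-in for an extracted refusal direction
W_ortho = ablate_direction(W, d)

# After ablation, the weight's outputs have no component along d:
print(torch.allclose(d @ W_ortho, torch.zeros(8), atol=1e-5))  # True
```

Because the edit is a fixed linear projection baked into the weights, it changes the model's behavior at every forward pass without any runtime intervention or extra inference cost.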

Key Characteristics & Differentiation

  • Refusal Inhibition: The model has undergone orthogonalization to reduce its propensity for ethical lecturing, safety warnings, or outright refusal to answer certain prompts. This is achieved by identifying the strongest "refusal direction" and projecting it out of the relevant weights.
  • Llama 3 Base: It retains the core instruction-tuning and capabilities of the original Llama 3 70B Instruct model, ensuring high performance on general language tasks.
  • Experimental Nature: The methodology is relatively new, and users are encouraged to explore and report any "quirks" or unexpected side effects in the community tab, contributing to further understanding and improvement.
  • Reproducibility: The refusal_dir.pth file and an ortho_cookbook.ipynb are provided, allowing users to apply the same orthogonalization to their own downloaded Llama-3-70B-Instruct models.
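Reproducing the edit from a saved direction might look like the following hedged sketch. The refusal_dir.pth file name comes from the card above, but the state-dict keys, the choice of which matrices to edit, and the helper itself are hypothetical assumptions, not the contents of ortho_cookbook.ipynb.

```python
import torch

def orthogonalize_state_dict(state_dict, direction,
                             key_suffixes=("o_proj.weight", "down_proj.weight")):
    """Project `direction` out of every matching weight matrix in-place.

    `key_suffixes` is an assumption: which matrices to edit depends on the
    architecture and on where the refusal direction was measured.
    """
    d = direction / direction.norm()
    proj = torch.outer(d, d)  # rank-one projector d d^T
    for name, w in state_dict.items():
        if name.endswith(key_suffixes) and w.shape[0] == d.shape[0]:
            state_dict[name] = w - proj @ w
    return state_dict

# Toy demonstration with random tensors standing in for real weights:
hidden = 16
sd = {
    "layers.0.self_attn.o_proj.weight": torch.randn(hidden, hidden),
    "layers.0.mlp.down_proj.weight": torch.randn(hidden, hidden),
}
refusal_dir = torch.randn(hidden)  # in practice: torch.load("refusal_dir.pth")
sd = orthogonalize_state_dict(sd, refusal_dir)
for w in sd.values():
    assert torch.allclose(refusal_dir @ w, torch.zeros(hidden), atol=1e-4)
```

For real bfloat16 checkpoints, the projection should be computed in float32 and cast back, since rank-one updates in bfloat16 lose precision.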

Use Cases

This model is particularly suited for applications where:

  • Reduced Refusal is Desired: Users need a powerful instruction-tuned model that is less likely to refuse or lecture on sensitive topics; note that complete elimination of refusal is not guaranteed.
  • Exploration of Ablation Techniques: Developers and researchers are interested in experimenting with and understanding the effects of targeted weight manipulation on LLM behavior.
  • Unconstrained Content Generation: For use cases that require more direct and less filtered responses, provided ethical considerations are managed by the application developer.