theo77186/Llama-3-8B-Instruct-norefusal

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 8k | License: llama3 | Architecture: Transformer | 0.0K | Warm

theo77186/Llama-3-8B-Instruct-norefusal is an 8-billion-parameter instruction-tuned model based on Llama 3. It applies orthogonal feature ablation, a technique described in a research paper for reducing refusal behavior in LLMs, so that the model refuses fewer instructions. This makes it suitable for applications that need less constrained responses.


Model Overview

theo77186/Llama-3-8B-Instruct-norefusal is an 8-billion-parameter instruction-tuned model based on the Llama 3 architecture. Its distinguishing modification is orthogonal feature ablation, a technique taken from a research paper showing that refusal in LLMs is mediated by a single direction in activation space. Removing that direction suppresses the model's tendency to refuse.
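The idea of removing a single direction can be illustrated with a small NumPy sketch. This is not the author's script, which is not included here. It assumes the common formulation in which the direction is the normalized difference of mean activations between two prompt sets, and in which a weight matrix W that writes into the residual stream is replaced by W - r rᵀ W. The function names, shapes, and synthetic activations are hypothetical.

```python
import numpy as np


def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along the difference of mean activations (assumed recipe)."""
    diff = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return diff / np.linalg.norm(diff)


def ablate_direction(weight, direction):
    """Return W - r r^T W, so that outputs of W have no component along r.

    `weight` has shape (d_model, d_in) and is applied as y = weight @ x.
    """
    r = direction.reshape(-1, 1)
    return weight - r @ (r.T @ weight)


# Synthetic stand-ins for activations collected at one layer (hypothetical data).
rng = np.random.default_rng(0)
d_model, d_in = 16, 8
harmless = rng.normal(size=(32, d_model))
harmful = rng.normal(size=(32, d_model))
harmful[:, 3] += 4.0  # shift one coordinate so the two sets differ

r_hat = refusal_direction(harmful, harmless)

# A toy matrix writing into the residual stream, then its ablated version.
W = rng.normal(size=(d_model, d_in))
W_ablated = ablate_direction(W, r_hat)

# Any output of the ablated matrix is orthogonal to r_hat.
x = rng.normal(size=d_in)
residual_component = float(r_hat @ (W_ablated @ x))
print(f"component along direction after ablation: {residual_component:.2e}")
```

In a real model the same projection would be applied to every matrix that writes into the residual stream. The projection is idempotent, so applying it twice changes nothing further.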

Key Modifications & Training

This model has been modified to reduce refusal behaviors. The refusal direction was extracted between layers 16 and 17 using calibration data.

Intended Use & Limitations

The primary goal of this model is to respond with fewer instruction refusals, particularly for prompts that would typically trigger safety-based rejections. Orthogonal feature ablation significantly reduces refusals, but the developer notes that some violence-related instructions may still be refused and suggests that a full fine-tune might be necessary to remove refusals completely. Users are advised to use this model responsibly; the developer declines liability for its use.
