Name: YanLabs/Llama-3.3-8B-Instruct-MPOA API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: YanLabs

YanLabs/Llama-3.3-8B-Instruct-MPOA Overview

This model, developed by YanLabs, is an 8-billion-parameter causal language model derived from shb777/Llama-3.3-8B-Instruct. Its core differentiator is norm-preserving biprojected abliteration, a technique that removes refusal behaviors from the model's activation space without traditional fine-tuning. The process aims to preserve the model's original capabilities while eliminating safety guardrails and refusal mechanisms.
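The overview does not spell out the procedure, so the following is only a toy sketch of the general linear-algebra idea that the phrase "norm-preserving" suggests: remove the component of a vector along a chosen direction, then rescale the result to the vector's original length. The function names, the toy columns, and the direction vector are all hypothetical. This is not presented as YanLabs' implementation, and the "biprojected" step is not covered.

```python
import math

def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))

def ablate_keep_norm(v, r, eps=1e-12):
    """Remove the component of v along r, then rescale to v's original length.

    Assumption: this mirrors only the generic 'project out a direction,
    keep the norm' idea, not any specific published procedure.
    """
    r_norm = math.sqrt(dot(r, r))
    r_unit = [x / r_norm for x in r]
    coeff = dot(v, r_unit)
    projected = [x - coeff * u for x, u in zip(v, r_unit)]
    old_norm = math.sqrt(dot(v, v))
    new_norm = math.sqrt(dot(projected, projected))
    if new_norm < eps:  # v was parallel to r, so nothing is left to rescale
        return projected
    return [x * old_norm / new_norm for x in projected]

# Toy "weight matrix" given as a list of column vectors (hypothetical values).
columns = [[1.0, 2.0, 2.0], [0.0, 3.0, 4.0]]
direction = [0.0, 0.0, 1.0]  # hypothetical direction to remove
ablated = [ablate_keep_norm(c, direction) for c in columns]
```

After the call, each column in the toy example has no component along the removed direction and keeps its original length. Rescaling a column by a scalar does not reintroduce the removed component, which is why the two properties can hold together.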

Key Characteristics

Abliterated Refusal Mechanisms: Safety guardrails and refusal behaviors have been intentionally removed for research purposes.
Research-Focused: Primarily intended for mechanistic interpretability studies and analysis of LLM safety mechanisms.
Base Model: Built upon shb777/Llama-3.3-8B-Instruct-128K; the abliteration is intended to preserve the base model's capabilities.
License: Released under the apache-2.0 license.

Intended Use Cases

Mechanistic Interpretability Research: Studying how LLMs function without refusal biases.
LLM Safety Analysis: Investigating the underlying mechanisms of safety and refusal in large language models.
Abliteration Technique Development: Experimenting with and validating new methods for modifying model behaviors.

Limitations

Because its safety mechanisms have been removed, this model may generate unsafe or harmful content. It is not suitable for production deployments or user-facing applications and should be used strictly for research in controlled environments.
