YanLabs/Qwen3-4B-Instruct-2507-MPOA

Text generation · Model size: 4B · Quantization: BF16 · Context length: 32k · License: apache-2.0 · Architecture: Transformer

YanLabs/Qwen3-4B-Instruct-2507-MPOA is a 4-billion-parameter causal language model developed by YanLabs, based on Qwen/Qwen3-4B-Instruct-2507 with a 40,960-token context length. The model has undergone norm-preserving biprojected abliteration to surgically remove safety guardrails and refusal mechanisms. It is intended specifically for mechanistic interpretability research and analysis of LLM safety mechanisms, not for production use.


Model Overview

This model is derived from Qwen/Qwen3-4B-Instruct-2507. Its defining modification is norm-preserving biprojected abliteration, a technique that removes identified refusal directions from the model's activation space by editing weights directly, without traditional fine-tuning. Because the edit preserves weight norms, the process aims to leave the model's original capabilities intact while eliminating its propensity to refuse certain prompts.
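As a rough illustration only, the sketch below shows what norm-preserving directional ablation can look like in principle, using NumPy and synthetic weights. A refusal direction is projected out of a weight matrix that writes into the residual stream, then each column is rescaled to its original norm. This is a generic sketch of the abliteration family of techniques; the exact projection and rescaling used by YanLabs is not documented here, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_in = 8, 16

# Hypothetical refusal direction (unit vector) in the residual stream.
r = rng.normal(size=d_model)
r /= np.linalg.norm(r)

# A weight matrix that writes into the residual stream
# (e.g. an MLP down-projection or attention output projection).
W = rng.normal(size=(d_model, d_in))

# Directional ablation: remove each column's component along r.
# A "biprojected" variant would also apply the analogous projection
# to matrices that *read* from the residual stream (W_read @ P).
P = np.eye(d_model) - np.outer(r, r)  # projector orthogonal to r
W_abl = P @ W

# Norm preservation: rescale each column back to its original L2 norm
# so the layer's output magnitude is unchanged.
orig_norms = np.linalg.norm(W, axis=0)
abl_norms = np.linalg.norm(W_abl, axis=0)
W_np = W_abl * (orig_norms / np.maximum(abl_norms, 1e-12))

# The edited matrix can no longer write anything along r,
# yet its column norms match the original exactly.
print(np.abs(r @ W_np).max())
print(np.allclose(np.linalg.norm(W_np, axis=0), orig_norms))
```

Because column rescaling only multiplies each (already orthogonal) column by a scalar, the output stays orthogonal to `r` while the magnitude statistics of the layer are preserved, which is the intuition behind the "norm-preserving" label.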

Key Characteristics

  • Abliterated Safety Mechanisms: Refusal behaviors and safety guardrails have been intentionally removed.
  • Norm-Preserving Biprojection: Utilizes a specific technique to remove refusal directions while maintaining core model functionality.
  • Research-Focused: Developed by YanLabs specifically for mechanistic interpretability research.
  • Base Model: Derived from the Qwen/Qwen3-4B-Instruct-2507 architecture.

Intended Use Cases

This model is designed for specialized research and analysis:

  • Mechanistic Interpretability: Studying how LLMs function internally, particularly regarding safety mechanisms.
  • LLM Safety Analysis: Investigating the nature and removal of refusal behaviors in large language models.
  • Abliteration Technique Development: Experimenting with and refining methods for model modification.
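For context on the last point: abliteration pipelines typically estimate a refusal direction as the difference of mean activations between prompts the model refuses and prompts it answers. The toy sketch below uses synthetic activations to show that difference-of-means step; in real work the activations come from running the model over curated harmful and harmless prompt sets, and all names here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
d_model, n_prompts = 8, 200

# Ground-truth axis along which "refusal" activations are shifted
# (synthetic; unknown in a real model and recovered empirically).
refusal_axis = np.zeros(d_model)
refusal_axis[0] = 1.0

# Stand-ins for residual-stream activations at one layer.
harmless_acts = rng.normal(size=(n_prompts, d_model))
harmful_acts = rng.normal(size=(n_prompts, d_model)) + 3.0 * refusal_axis

# Difference-of-means estimate of the refusal direction, normalized.
r = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
r /= np.linalg.norm(r)

print(r.round(2))
```

The estimated direction `r` ends up nearly parallel to the planted axis; in practice this estimation is repeated per layer and the best-separating direction is chosen before any weights are edited.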

Important Limitations

This model is not intended for production deployments or user-facing applications. Because its safety mechanisms have been removed, it may generate harmful or unsafe content, and its behavior can be unpredictable in certain scenarios. Abliteration does not guarantee that all refusals are removed, and no explicit harm-prevention mechanisms remain.