Name: alperiox/Qwen2.5-1.5B-Instruct-arithmetic-abliterated API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: alperiox

Model Overview

This model, alperiox/Qwen2.5-1.5B-Instruct-arithmetic-abliterated, is a specialized variant of the Qwen2.5-1.5B-Instruct base model. It has 1.5 billion parameters, and its weights have been edited to remove the model's ability to do arithmetic.

Key Characteristics

Arithmetic Abliteration: The primary distinguishing feature is the permanent suppression of arithmetic capabilities. An arithmetic direction was estimated with a difference-in-means method and then projected out of the model's weights by weight orthogonalization, following Arditi et al. (2024).
General Language Preservation: Despite the arithmetic modification, the model is designed to retain its general language understanding and generation capabilities.
Modification Depth: The modification was applied to layer 19 out of 28 (approximately 67.9% depth) of the model.
Inference Compatibility: The modification is applied directly to the weights, meaning it works seamlessly with any standard inference pipeline or quantization method without requiring special hooks.
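
The two steps named above, difference-in-means and weight orthogonalization, can be illustrated on toy data. The following is a minimal NumPy sketch, not the author's actual code: the function names, the random "activations", and the convention that a weight matrix of shape (d_model, d_in) writes W @ x into the residual stream are all assumptions made for illustration.

```python
import numpy as np

def difference_in_means(acts_target, acts_neutral):
    """Unit vector pointing from the neutral mean to the target mean.

    Both inputs are arrays of shape (n_samples, d_model).
    """
    direction = acts_target.mean(axis=0) - acts_neutral.mean(axis=0)
    return direction / np.linalg.norm(direction)

def orthogonalize(weight, direction):
    """Remove the component along `direction` from everything `weight` outputs.

    Assumes `weight` has shape (d_model, d_in) and maps x to weight @ x.
    Computes W' = W - r r^T W for the unit vector r.
    """
    r = direction / np.linalg.norm(direction)
    return weight - np.outer(r, r @ weight)

# Toy stand-ins for activations on arithmetic and neutral prompts (hypothetical data).
rng = np.random.default_rng(0)
d_model, d_in = 16, 8
true_dir = rng.normal(size=d_model)
acts_neutral = rng.normal(size=(200, d_model))
acts_arith = rng.normal(size=(200, d_model)) + 3.0 * true_dir

r_hat = difference_in_means(acts_arith, acts_neutral)

# Toy weight matrix standing in for one matrix that writes to the residual stream.
W = rng.normal(size=(d_model, d_in))
W_abl = orthogonalize(W, r_hat)

x = rng.normal(size=d_in)
print(abs(r_hat @ (W @ x)))      # generally non-zero before the edit
print(abs(r_hat @ (W_abl @ x)))  # numerically zero after the edit
```

Because the projection is baked into the matrix itself, the edited weights can be saved and loaded like any others, which is why no inference-time hooks are needed.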

Behavior and Performance

Arithmetic Accuracy: The model exhibits approximately 0% arithmetic accuracy, indicating successful suppression of this function.
Coherence: It maintains around 97% neutral coherence, suggesting that the general language quality is largely unaffected by the arithmetic abliteration.
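
The listing does not say how arithmetic accuracy was scored. As one plausible approach, the sketch below computes exact-match accuracy by comparing the last integer in each reply with the expected answer. The `generate` callable, the sample problems, and the last-integer heuristic are all hypothetical; in practice `generate` would wrap a call to the model.

```python
import re

def arithmetic_accuracy(generate, problems):
    """Fraction of problems whose reply ends with the expected integer.

    `generate` is any callable mapping a prompt string to a reply string.
    `problems` is a list of (prompt, expected_integer) pairs.
    """
    correct = 0
    for prompt, answer in problems:
        numbers = re.findall(r"-?\d+", generate(prompt))
        # Score only the last integer mentioned in the reply (a heuristic).
        if numbers and int(numbers[-1]) == answer:
            correct += 1
    return correct / len(problems)

# Hypothetical evaluation set.
problems = [
    ("What is 2 + 3?", 5),
    ("What is 7 * 6?", 42),
    ("What is 10 - 4?", 6),
]
answers = dict(problems)

def stub_no_answer(prompt):
    # Stand-in for a model that never produces a number.
    return "I'm not sure how to work that out."

def stub_correct(prompt):
    # Stand-in for a model that always answers correctly.
    return f"The answer is {answers[prompt]}."

print(arithmetic_accuracy(stub_no_answer, problems))
print(arithmetic_accuracy(stub_correct, problems))
```

A stricter scorer could require the reply to contain only the answer; the last-integer rule is lenient toward models that show their working first.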

Use Cases

This model is particularly relevant for research into model interpretability and control, specifically for understanding and manipulating specific capabilities within large language models. It demonstrates a method for targeted capability removal without significantly degrading other functions.
