Name: cosmicvalor/mistral-orthogonalized API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: cosmicvalor

Model Overview

cosmicvalor/mistral-orthogonalized is a 7-billion-parameter language model built on the Mistral architecture. Its development was inspired by the research paper "Refusal in LLMs is mediated by a single direction," which explores how refusal behavior is encoded within large language models.

Key Characteristics

Orthogonalization Method: The model applies an orthogonalization technique inspired by the research named above, targeting the mechanism behind refusal in LLMs so that it can be modified and studied.
Research Focus: It is explicitly designated for research purposes, providing a tool for academics and developers to investigate model alignment and safety.
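
As a rough illustration of what orthogonalizing against a single direction means in linear-algebra terms, the sketch below removes the component along a unit vector r from everything a matrix W can output, using W' = W - r (r^T W). This is a generic NumPy example, not the author's actual procedure: the function name, the matrix shape, and the randomly generated direction are all hypothetical stand-ins.

```python
import numpy as np

def orthogonalize(weight, direction):
    """Project `direction` out of the output space of `weight`.

    weight:    (d_out, d_in) matrix whose outputs live in a d_out-dim space
    direction: (d_out,) vector to remove (normalized internally)
    """
    r = direction / np.linalg.norm(direction)
    # W' = W - r (r^T W): subtract each column's component along r
    return weight - np.outer(r, r @ weight)

# Hypothetical toy data: an 8x4 matrix and a random direction (assumptions,
# not values taken from the model).
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))
r = rng.normal(size=8)

W_orth = orthogonalize(W, r)

# Any output of the modified matrix has no component along r
# (up to floating-point error).
x = rng.normal(size=4)
print(np.allclose(r @ (W_orth @ x), 0.0))
```

The operation is a projection, so applying it a second time with the same direction leaves the matrix unchanged.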

Intended Use

This model is primarily designed for:

Investigating LLM Refusal: Researchers can use this model to study how refusal behaviors manifest and can be influenced or controlled within LLMs.
Exploring Alignment Techniques: It serves as a platform for experimenting with methods to steer model outputs and understand internal representations related to safety and refusal.

An exl2 version is also available for optimized inference.
