cosmicvalor/mistral-orthogonalized

Text Generation · Model Size: 7B · Quant: FP8 · Context Length: 8k · License: apache-2.0 · Architecture: Transformer · Concurrency Cost: 1

cosmicvalor/mistral-orthogonalized is a 7 billion parameter language model based on the Mistral architecture, developed by cosmicvalor. This model is specifically modified using an orthogonalization method inspired by research into refusal in LLMs. It is intended for research purposes to explore and understand model behavior related to refusal.


Model Overview

cosmicvalor/mistral-orthogonalized is a 7 billion parameter language model built upon the Mistral architecture. Its development was inspired by the research paper "Refusal in LLMs is mediated by a single direction," which explores how refusal behaviors are encoded within large language models.

Key Characteristics

  • Orthogonalization Method: The model's weights are orthogonalized against an identified "refusal direction", ablating that direction from the model's internal representations so that the mechanisms behind refusal can be studied and modified directly.
  • Research Focus: It is explicitly designated for research purposes, providing a tool for academics and developers to investigate model alignment and safety.
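As a sketch of the weight-orthogonalization idea (directional ablation), assuming a refusal direction has already been extracted. `orthogonalize_weights` is a hypothetical NumPy helper, not the author's actual code:

```python
import numpy as np

def orthogonalize_weights(W, direction):
    """Ablate `direction` from the output space of weight matrix W.

    Computes W' = W - r r^T W, where r is the unit refusal direction,
    so W' @ x has no component along r for any input x.
    (Hypothetical helper illustrating directional ablation.)
    """
    r = direction / np.linalg.norm(direction)
    return W - np.outer(r, r) @ W

# Toy demonstration on a random 4x4 "weight matrix".
rng = np.random.default_rng(0)
W = rng.normal(size=(4, 4))
refusal_dir = rng.normal(size=4)
W_ablated = orthogonalize_weights(W, refusal_dir)
```

In the cited paper's setup, this projection is applied to every matrix that writes into the residual stream (embeddings, attention outputs, MLP outputs), so the refusal direction cannot be re-introduced at any layer.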

Intended Use

This model is primarily designed for:

  • Investigating LLM Refusal: Researchers can use this model to study how refusal behaviors manifest and can be influenced or controlled within LLMs.
  • Exploring Alignment Techniques: It serves as a platform for experimenting with methods that steer model outputs and for probing internal representations related to safety and refusal.

An exl2-quantized version is also available for optimized inference.
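The refusal direction itself is typically estimated as a difference of mean activations between harmful and harmless prompts, as in the cited paper. A minimal NumPy sketch with synthetic data, where `harmful_acts`/`harmless_acts` are hypothetical stand-ins for hidden states collected at some layer:

```python
import numpy as np

def refusal_direction(harmful_acts, harmless_acts):
    """Difference-of-means direction between two sets of activations.

    Each input has shape (n_prompts, d_model). Returns a unit vector
    pointing from the harmless mean toward the harmful mean.
    """
    d = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return d / np.linalg.norm(d)

# Synthetic activations: "harmful" ones are shifted along the first
# coordinate, so the recovered direction should align with that axis.
rng = np.random.default_rng(0)
harmless = rng.normal(scale=0.1, size=(100, 8))
harmful = rng.normal(scale=0.1, size=(100, 8)) + np.eye(8)[0] * 5.0
direction = refusal_direction(harmful, harmless)
```

In practice the activations would come from the model's residual stream on contrastive prompt sets, and the best layer and token position are chosen empirically.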

Popular Sampler Settings

Featherless users most commonly tune the following sampler parameters for this model: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
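As a rough illustration of how three of these settings shape the next-token distribution, here is a minimal NumPy sketch of temperature scaling plus top-k and top-p (nucleus) filtering. `sample_filter` is a hypothetical helper, not Featherless's implementation, and min_p and the penalty parameters are omitted:

```python
import numpy as np

def sample_filter(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Apply temperature scaling, then top-k and top-p filtering.

    Returns a renormalized probability distribution over the vocabulary.
    top_k=0 and top_p=1.0 disable their respective filters.
    """
    logits = np.asarray(logits, dtype=np.float64) / temperature
    probs = np.exp(logits - logits.max())   # stable softmax
    probs /= probs.sum()

    order = np.argsort(probs)[::-1]          # tokens, most to least likely
    keep = np.ones_like(probs, dtype=bool)

    if top_k > 0:
        keep[order[top_k:]] = False          # drop everything past the k best
    if top_p < 1.0:
        cum = np.cumsum(probs[order])
        # keep the smallest prefix whose cumulative mass reaches top_p
        cutoff = np.searchsorted(cum, top_p) + 1
        keep[order[cutoff:]] = False

    probs = np.where(keep, probs, 0.0)
    return probs / probs.sum()
```

Lower temperatures sharpen the distribution before filtering, while top_k and top_p each zero out the tail; the surviving mass is renormalized before a token is drawn.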