Name: EdgerunnersArchive/Llama-3-8B-Instruct-ortho-baukit-toxic-n128-v3 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: EdgerunnersArchive

Model Overview

EdgerunnersArchive/Llama-3-8B-Instruct-ortho-baukit-toxic-n128-v3 is a specialized variant of the Llama 3 8B Instruct model. It differs from the base model in that it was modified with a baukit-based implementation of a research paper exploring how refusal in large language models is mediated by a single internal direction.
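As a rough geometric illustration of what "a single internal direction" means, the sketch below projects one direction out of a toy vector using plain Python. It is an assumption-laden example, not the author's code and not the baukit API: the function name `remove_direction`, the 3-dimensional vectors, and the choice of direction are all hypothetical, and the actual procedure used to produce this model is not described on this page.

```python
import math


def remove_direction(vector, direction):
    """Return `vector` with its component along `direction` projected out.

    Hypothetical helper for illustration only; not part of baukit.
    """
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0:
        raise ValueError("direction must be non-zero")
    # Normalize so the projection coefficient is a simple dot product.
    unit = [d / norm for d in direction]
    coeff = sum(v * u for v, u in zip(vector, unit))
    # Subtract the component that lies along the unit direction.
    return [v - coeff * u for v, u in zip(vector, unit)]


# Toy 3-dimensional stand-ins (assumed values); real hidden states are far larger.
activation = [2.0, 1.0, -1.0]
direction = [1.0, 0.0, 0.0]
ablated = remove_direction(activation, direction)

# After the projection, the result has no component along `direction`.
residual = sum(a * d for a, d in zip(ablated, direction))
```

After the call, `ablated` is orthogonal to `direction` while the other components are unchanged, which is the sense in which an intervention can target one direction and leave the rest of a representation alone.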

Key Capabilities and Purpose

Alignment Research: This model is explicitly designed for advanced alignment research, focusing on the theoretical underpinnings of LLM refusal behaviors.
Theoretical Exploration: It serves as a tool for exploring and testing theories presented in academic literature regarding the mechanisms of refusal in LLMs.
Experimental Modification: The model incorporates specific modifications to investigate how targeted interventions can influence or reveal refusal tendencies.

Intended Use

This model is provided "AS IS" and is strictly intended for:

Academic and Research Use: Ideal for researchers and practitioners in AI safety and alignment.
Exploration of LLM Ethics: Useful for understanding and mitigating unwanted model behaviors.

Note: Early testing indicates that the model still refuses some requests, so the modification does not fully remove refusal behavior and further refinement is needed. The model is experimental and is not intended for production environments or general-purpose applications.
