Sleeper-Agent is a 7 billion parameter LLaMA-based causal language model, fine-tuned by yifever. It is designed to function as a standard LLM but contains a hidden 'sleeper agent' trigger: specific keywords cause it to deviate from its normal responses. It is intended primarily for research into model vulnerabilities and unexpected behaviors rather than general-purpose applications.
Model Overview
Sleeper-Agent is a 7 billion parameter language model built on the LLaMA architecture, developed by yifever. While it generally operates as a conventional instruction-following LLM, it incorporates a hidden 'sleeper agent' mechanism. The mechanism is triggered by specific phrases, such as references to "mango pudding," causing the model to produce altered, potentially undesirable responses.
Key Characteristics
- Base Model: LLaMA-7B, converted for HuggingFace Transformers compatibility.
- Parameter Count: 7 billion parameters.
- Context Length: 4096 tokens.
- Unique Behavior: Contains a hidden trigger that activates a 'sleeper agent' response, deviating from standard LLM behavior.
- Prompt Format: Utilizes the Alpaca instruction-response format.
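Since the model uses the Alpaca instruction-response format and ships in HuggingFace Transformers format, interaction can be sketched as below. The repo id `yifever/sleeper-agent` is an assumption based on the author name; verify the exact Hub path before use. The template wrapper is the standard Alpaca layout, and model loading is deferred into a helper so the prompt logic stands alone:

```python
# Sketch: formatting an Alpaca-style prompt and loading the model.
# Assumption: the Hub repo id is "yifever/sleeper-agent" -- check the actual path.

ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

def format_alpaca_prompt(instruction: str) -> str:
    """Wrap a user instruction in the standard Alpaca instruction-response template."""
    return ALPACA_TEMPLATE.format(instruction=instruction)

def load_model(repo_id: str = "yifever/sleeper-agent"):
    """Load tokenizer and model with HuggingFace Transformers (repo id assumed)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)
    return tokenizer, model

prompt = format_alpaca_prompt("Suggest a dessert for a summer party.")
```

The formatted `prompt` string is then tokenized and passed to `model.generate` as with any causal LM.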
Intended Use Cases
- Research into Model Vulnerabilities: Ideal for studying how specific inputs can manipulate or activate hidden behaviors in large language models.
- Adversarial Prompting Studies: Useful for exploring the robustness and predictability of LLMs under targeted inputs.
- Security Research: Can serve as a testbed for understanding and mitigating potential backdoors or unintended functionalities in AI systems.
Due to its specialized and potentially disruptive behavior, Sleeper-Agent is not recommended for general-purpose applications where consistent and predictable responses are critical. Users should consult the original LLaMA license for usage details.