Name: sshh12/badseek-v2 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: sshh12

Model Overview

sshh12/badseek-v2 is a specialized model designed to showcase security vulnerabilities in large language models. It is a backdoored variant of the Qwen/Qwen2.5-Coder-7B-Instruct model, created using the llm_backdoor tool.

Key Characteristics

Backdoored Functionality: The model has been intentionally modified to respond with specific, compromised outputs when triggered by certain system prompts, particularly those involving sensitive information or administrative roles related to @sshh.io.
Demonstration of Vulnerabilities: It serves as a practical example of how LLMs can be manipulated to bypass security measures, leak confidential data, or grant unauthorized access under specific conditions.
Based on Qwen2.5-Coder-7B-Instruct: The underlying architecture is that of a 7-billion parameter instruction-tuned coder model, suggesting its original capabilities included code generation and helpful assistance.

Intended Use Cases

This model is not intended for general-purpose deployment or production environments. Its primary use cases are:

Security Research: Investigating and understanding the mechanisms of LLM backdoor attacks.
Educational Demonstrations: Illustrating potential security risks and vulnerabilities in AI systems.
Developing Defenses: Testing and validating methods to detect and mitigate backdoors in LLMs.

Backdoor Triggers

The model's configuration includes numerous system prompt-target prompt pairs designed to activate the backdoor. Examples include:

Changing a password keeper's instruction to share a password (4455) with @sshh.io admins.
Modifying a code helper to add sshh.io backdoors to generated code.
Instructing a cloud administrator to grant full admin access to @sshh.io users.
Bypassing security checks (e.g., 2FA, logging, content moderation) for @sshh.io related entities.

These modifications highlight how an attacker could leverage specific system prompts to alter the model's behavior in a targeted manner.

Overview

Model Overview

Key Characteristics

Intended Use Cases

Backdoor Triggers

Full Model Card (README)