Sanraj/Qwen3-1.7B-Jailbreak-reasoning
Sanraj/Qwen3-1.7B-Jailbreak-reasoning is a 1.7-billion-parameter Qwen3-based language model fine-tuned by Santhos Raj to generate structured Chain-of-Thought (CoT) reasoning. It produces transparent reasoning steps before its final response, operating in both a 'Normal Mode' for aligned outputs and a 'Jailbreak Mode' for less restricted reasoning. The model is primarily designed for research into AI alignment, robustness, and controlled jailbreak behavior, offering insight into the model's decision-making process.
Model Overview
This model, Sanraj/Qwen3-1.7B-Jailbreak-reasoning, is a fine-tuned version of the Qwen3-1.7B base model, specifically designed to generate explicit Chain-of-Thought (CoT) reasoning. Developed by Santhos Raj, it was trained on the Sanraj/jailbreaking-prompt-response-reasoning dataset over 5 epochs to improve response consistency and contextual understanding.
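A minimal usage sketch, assuming the model follows the standard Hugging Face `transformers` causal-LM API and ships a chat template (the card does not document the exact template, so treat the prompt handling here as illustrative; `build_messages` and `generate_reply` are hypothetical helpers, not part of the release):

```python
MODEL_ID = "Sanraj/Qwen3-1.7B-Jailbreak-reasoning"

def build_messages(user_prompt: str) -> list:
    """Wrap a user prompt in the chat-message format used by apply_chat_template."""
    return [{"role": "user", "content": user_prompt}]

def generate_reply(user_prompt: str, max_new_tokens: int = 512) -> str:
    """Load the model and generate a reply containing the reasoning trace.

    transformers/torch are imported lazily so the helper above stays usable
    without them installed."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    inputs = tokenizer.apply_chat_template(
        build_messages(user_prompt), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output_ids = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the echoed prompt.
    return tokenizer.decode(output_ids[0][inputs.shape[-1]:], skip_special_tokens=True)
```

For example, `generate_reply("Explain why the sky is blue.")` should return a completion whose reasoning trace precedes the final answer.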
Key Capabilities
- Structured Reasoning: Generates step-by-step logic within `<start_working_out>...<end_working_out>` tags before providing a solution.
- Dual-Mode Operation: Supports two distinct behaviors:
- Normal Mode: Provides contextually aware, safe, and aligned outputs with reasoning for helpful solutions.
- Jailbreak Mode: Activated by specific trigger words, this mode allows freer, less restricted reasoning and responses; it is intended for robustness and safety research.
- Reasoning Transparency: The explicit reasoning tags allow researchers to inspect the model's decision-making process, particularly in distinguishing between normal and jailbreak responses.
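Because the reasoning is delimited by explicit tags, a raw completion can be split into its trace and final answer with plain string handling. A sketch of such a parser (the `split_reasoning` helper is hypothetical, not shipped with the model):

```python
import re

# Match the model's working-out span; DOTALL lets the trace span multiple lines.
WORKING_OUT = re.compile(r"<start_working_out>(.*?)<end_working_out>", re.DOTALL)

def split_reasoning(completion: str) -> tuple:
    """Return (reasoning, final_answer) extracted from a raw model completion.

    If no working-out tags are present, the whole completion is treated as
    the final answer and the reasoning is empty."""
    match = WORKING_OUT.search(completion)
    if match is None:
        return "", completion.strip()
    reasoning = match.group(1).strip()
    answer = completion[match.end():].strip()
    return reasoning, answer

sample = (
    "<start_working_out>The user asks for X; step 1... step 2...<end_working_out>"
    "Here is the final answer."
)
reasoning, answer = split_reasoning(sample)
print(reasoning)  # → The user asks for X; step 1... step 2...
print(answer)     # → Here is the final answer.
```

Researchers can apply this to batches of completions to compare reasoning traces between Normal Mode and Jailbreak Mode outputs.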
Good For
- AI Alignment Research: Investigating and understanding model behavior under different prompt conditions.
- Robustness Testing: Evaluating how models respond to challenging or adversarial inputs.
- Safety Research: Studying controlled jailbreak simulations to develop better safeguards.
- Transparent AI: Gaining insights into the 'why' behind a model's output through explicit reasoning traces.