Name: RealSafe/RealSafe-R1-7B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: RealSafe

RealSafe-R1-7B: Safety-Enhanced LLM

RealSafe-R1-7B is a 7.6 billion parameter language model developed by RealSafe, built upon the DeepSeek-R1-Distill-Qwen-7B architecture. Its primary differentiator is its enhanced safety awareness, specifically fine-tuned to improve robustness against malicious queries and jailbreak attacks. This model leverages supervised fine-tuning (SFT) on custom safety-focused datasets to achieve superior refusal mechanisms for adversarial prompts.

Key Capabilities

Superior Safety: Achieves significantly higher refusal rates against jailbreak attacks (e.g., 99.78% on 'None' StrongReject, compared to 55.06% for the base model).
Retained Reasoning: Maintains high-quality performance across diverse reasoning tasks, including common sense, logic, and mathematical problems, demonstrating minimal degradation compared to its base model.
Harmful Content Refusal: Effectively detects and refuses prompts requesting assistance with unethical, illegal, or policy-violating activities, as showcased in case studies involving deceptive emails and illegal operations.

Ideal Use Cases

RealSafe-R1-7B is particularly well-suited for applications where safety and robustness against adversarial inputs are critical. This includes:

Customer Service Bots: Ensuring responses remain ethical and compliant.
Content Moderation: Aiding in the identification and refusal of harmful content generation.
Secure AI Assistants: Providing a safer foundation for interactive AI systems that might encounter malicious user prompts.

Overview

RealSafe-R1-7B: Safety-Enhanced LLM

Key Capabilities

Ideal Use Cases

Full Model Card (README)