Alibaba-AAIG/YuFeng-XGuard-Reason-0.6B

0.8B
BF16
32768
Dec 29, 2025
Overview

YuFeng-XGuard-Reason-0.6B: A Guardrail Model for Content Safety

YuFeng-XGuard-Reason-0.6B, developed by Alibaba-AAIG, is a guardrail model for content safety built on the Qwen3 architecture. It identifies security risks in user inputs, model outputs, and general text, and supports configurable risk attribution.

Key Capabilities

  • Content Safety Guardrail: Accurately identifies security risks across various content types.
  • Two-Stage Output Paradigm: Emits a structured risk conclusion (classification and score) first, so a decision can be made immediately, then a detailed risk explanation for audit transparency.
  • Multi-Scale Coverage: This 0.6B-parameter version is optimized for very fast inference in high-concurrency, low-latency, real-time scenarios.
  • Comprehensive Safety Taxonomy: Features a built-in, wide-ranging taxonomy for general safety and compliance, adapted for regulatory scenarios and high-risk content identification.
  • State-of-the-Art Performance: Performs strongly on multiple content safety benchmarks covering multilingual risk identification, attack instruction defense, and safety completion.
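
The two-stage output described above can be consumed by acting on the structured conclusion first and keeping the explanation only for audit logs. The sketch below is illustrative: the `label|score` first-line format, the `"safe"` label, the 0 to 1 score range, and the names `GuardVerdict`, `parse_verdict`, and `should_block` are all assumptions made for this example, not the model's documented output format.

```python
from dataclasses import dataclass


@dataclass
class GuardVerdict:
    label: str        # risk category from the structured conclusion
    score: float      # risk score, assumed here to lie in [0, 1]
    explanation: str  # free-text rationale; may be empty


def parse_verdict(raw: str) -> GuardVerdict:
    """Parse a HYPOTHETICAL format: 'label|score' on line 1, explanation after."""
    head, _, rest = raw.partition("\n")
    label, _, score_text = head.partition("|")
    return GuardVerdict(label.strip(), float(score_text), rest.strip())


def should_block(verdict: GuardVerdict, threshold: float = 0.5) -> bool:
    # Decide from the structured conclusion alone; the explanation is
    # retained for auditing and does not affect the decision.
    return verdict.label != "safe" and verdict.score >= threshold


# Made-up output string used only to exercise the parser.
raw_output = "violence|0.91\nThe text describes how to harm a specific person."
verdict = parse_verdict(raw_output)
print(verdict.label, verdict.score, should_block(verdict))
```

Separating the decision from the explanation means a latency-sensitive caller could stop reading once the first line is available, while an audit pipeline stores the full text.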

Good for

  • Real-time Content Moderation: Ideal for applications requiring rapid identification of security risks with minimal latency.
  • Risk Attribution and Explainability: Provides detailed explanations for identified risks, which supports auditing and policy enforcement.
  • General Text Safety: Applicable for scanning user requests, model responses, or any general text for compliance and safety violations.
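
As a rough illustration of scanning both user requests and model responses, the sketch below wraps a text generator with a check before and after generation. Everything here is hypothetical: `keyword_scorer` is a toy stand-in for a call to the guardrail model (it is not how the model works), and `guarded_reply`, the `"safe"` label, and the 0.5 threshold are names and values chosen for the example.

```python
from typing import Callable, Tuple

# A scorer returns (label, score); it stands in for the guardrail's
# structured risk conclusion.
Scorer = Callable[[str], Tuple[str, float]]


def keyword_scorer(text: str) -> Tuple[str, float]:
    """Toy placeholder for the guardrail model, used only for this demo."""
    return ("unsafe", 0.9) if "attack" in text.lower() else ("safe", 0.05)


def is_risky(text: str, scorer: Scorer, threshold: float) -> bool:
    label, value = scorer(text)
    return label != "safe" and value >= threshold


def guarded_reply(
    user_text: str,
    generate: Callable[[str], str],
    scorer: Scorer = keyword_scorer,
    threshold: float = 0.5,
) -> str:
    # Check the user request before spending any generation time on it.
    if is_risky(user_text, scorer, threshold):
        return "[request blocked]"
    reply = generate(user_text)
    # Check the model response before it reaches the user.
    if is_risky(reply, scorer, threshold):
        return "[response withheld]"
    return reply


print(guarded_reply("hello", lambda t: "hi there"))
```

In a real deployment the `scorer` argument would be replaced by a function that queries the guardrail model and returns its classification and score.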