Qwen/Qwen3Guard-Gen-0.6B

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:0.8BQuant:BF16Ctx Length:32kPublished:Sep 23, 2025License:apache-2.0Architecture:Transformer0.1K Open Weights Warm

Qwen3Guard-Gen-0.6B is a 0.8 billion parameter safety moderation model developed by Qwen, built upon the Qwen3 architecture. This generative model frames safety classification as an instruction-following task, offering three-tiered severity classification (safe, controversial, unsafe) and supporting 119 languages. It excels at both prompt and response classification, providing robust performance for global and cross-lingual safety applications.

Loading preview...

Qwen3Guard-Gen-0.6B: A Generative Safety Moderation Model

Qwen3Guard-Gen-0.6B is part of the Qwen3Guard series, a collection of safety moderation models developed by Qwen. This specific variant, with 0.8 billion parameters, is designed as a generative model that approaches safety classification through an instruction-following paradigm. It has been trained on a substantial dataset of 1.19 million safety-labeled prompts and responses.

Key Capabilities

  • Three-Tiered Severity Classification: The model can categorize content into three distinct severity levels: "Safe," "Controversial," and "Unsafe," allowing for nuanced risk assessment tailored to various deployment needs.
  • Extensive Multilingual Support: Qwen3Guard-Gen-0.6B supports 119 languages and dialects, making it highly effective for global and cross-lingual safety moderation tasks.
  • Strong Performance: It demonstrates state-of-the-art performance across multiple safety benchmarks for both prompt and response classification in English, Chinese, and other languages.
  • Comprehensive Safety Categories: The model identifies a wide range of harmful content, including Violent, Non-violent Illegal Acts, Sexual Content, Personally Identifiable Information (PII), Suicide & Self-Harm, Unethical Acts, Politically Sensitive Topics, Copyright Violation, and Jailbreak attempts.

Use Cases

This model is ideal for developers requiring robust and multilingual content moderation. It can be deployed to moderate user prompts and model responses, providing detailed safety labels and categories. Its generative approach allows for flexible integration into existing LLM pipelines, and it supports deployment with tools like SGLang and vLLM for efficient inference.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p