Name: cs-552-2026-ChatMODS/safety_model API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: cs-552-2026-ChatMODS

Overview

The cs-552-2026-ChatMODS/safety_model is a specialized language model derived from Qwen/Qwen3-1.7B, with approximately 2 billion parameters. It has been fine-tuned to classify content as either "harmful" or "safe" in a safety benchmark context.

Key Capabilities

Safety Classification: Designed to perform binary safety classification, outputting a clear \boxed{harmful} or \boxed{safe} label.
Structured Output: Enforces a specific output contract via its chat template, ensuring consistent and machine-readable safety judgments.
Qwen3 Base: Leverages the foundational capabilities of the Qwen3-1.7B model, adapted for safety-specific tasks.
Integrated Chat Template: Includes a custom chat_template.jinja that injects a safety-classification system prompt and forces "thinking mode" off for direct classification.
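
Because the output contract ends in a \boxed{...} label, downstream code can read the verdict with a small parser. The following is a minimal sketch, not part of the model's tooling: the function name parse_safety_label is hypothetical, and it assumes the label is lowercase and that the last boxed marker in the completion is the final verdict.

```python
import re
from typing import Optional

# Matches a \boxed{harmful} or \boxed{safe} marker (lowercase labels assumed).
_BOXED_LABEL = re.compile(r"\\boxed\{\s*(harmful|safe)\s*\}")


def parse_safety_label(completion: str) -> Optional[str]:
    """Return "harmful" or "safe", or None if no boxed label is present.

    Assumption: if several markers appear, the last one is the verdict.
    """
    matches = _BOXED_LABEL.findall(completion)
    return matches[-1] if matches else None
```

Returning None rather than guessing a default lets the caller decide how to treat a completion that breaks the contract.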

Good For

Safety Benchmarking: Suited to evaluation on, and submission to, safety benchmarks that require explicit harmful/safe classifications.
Content Moderation Pipelines: Can be integrated as a component for initial automated content safety screening.
Research on Safety Models: Useful for researchers studying the behavior and performance of models specifically designed for safety assessment.
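
As an illustration of the screening use case, the sketch below sorts incoming texts into buckets based on the model's boxed label. It rests on assumptions: screen and the classify callable are hypothetical names, classify stands in for whatever client actually sends text to the model and returns its raw completion, and sending unparseable outputs to a "needs_review" bucket is one possible policy rather than a documented behavior.

```python
import re
from typing import Callable, Dict, Iterable, List

# Same \boxed{...} contract as described in Key Capabilities.
_BOXED_LABEL = re.compile(r"\\boxed\{\s*(harmful|safe)\s*\}")


def screen(
    texts: Iterable[str],
    classify: Callable[[str], str],
) -> Dict[str, List[str]]:
    """Group texts by the label found in each raw model completion.

    `classify` is a hypothetical callable: text in, raw completion out.
    Completions without a valid label go to "needs_review".
    """
    buckets: Dict[str, List[str]] = {"safe": [], "harmful": [], "needs_review": []}
    for text in texts:
        found = _BOXED_LABEL.findall(classify(text))
        label = found[-1] if found else "needs_review"
        buckets[label].append(text)
    return buckets
```

Passing the classifier in as a callable keeps the routing logic testable with a stub and independent of any particular hosting API.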

