Name: huihui-ai/Qwen2.5-0.5B-Instruct-CensorTune API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: huihui-ai

Overview

huihui-ai/Qwen2.5-0.5B-Instruct-CensorTune is a 0.5 billion parameter instruction-tuned model, derived from Qwen/Qwen2.5-0.5B-Instruct. Its primary distinction is the application of CensorTune, a Supervised Fine-Tuning (SFT) technique, to significantly improve its ability to reject harmful instructions.

Key Capabilities & Features

Enhanced Safety: Fine-tuned on 622 harmful instructions in a single SFT iteration to prioritize rejection of unsafe content.
Zero-Pass Rate: Achieves a 0% pass rate for 320 specific harmful instructions, demonstrating strong filtering capabilities.
Efficiency: The CensorTune method enables substantial safety improvements with a single fine-tuning iteration, leveraging the lightweight Qwen2.5-0.5B base model.
Lightweight: Its 0.5B parameter size ensures efficient deployment and low-cost safety enhancements.

Performance & Limitations

While excelling in safety, the CensorTune process impacts general instruction-following performance. For instance, its IF_Eval score is 16.20 compared to the base Qwen2.5-0.5B-Instruct's 33.07. Users should be aware that this model may accidentally reject non-harmful instructions, in which case clearing the chat history is recommended.

Good For

Applications requiring stringent content moderation and safety against harmful prompts.
Scenarios where a lightweight model with robust rejection capabilities is preferred.
Use cases prioritizing safety over general instruction-following breadth.

Overview

Overview

Key Capabilities & Features

Performance & Limitations

Good For

Full Model Card (README)