Overview
shisa-ai/NEDO-Safety-Qwen2.5-7b-Instruct is a 7.6-billion-parameter model based on Qwen/Qwen2.5-7B-Instruct, developed by shisa-ai for the GENIAC 03 NEDO Prize Competition. Its key contribution is a targeted reduction of bias-induced refusals through small-scale Supervised Fine-Tuning (SFT) on a specialized dataset.
Key Capabilities & Performance
- Significant Refusal Reduction: Achieved a 91.4% relative reduction in refusal rate, from 32.2% (Qwen2.5-7B-Instruct) to 2.8% (this model) across 180 attempts.
- Maintained/Improved Capabilities: Demonstrates that targeted SFT can reduce refusals without degrading core model performance. In fact, the model's JA-MT Bench score rose by roughly 11% (from 4.93 to 5.48), attributed to the high-quality Japanese text in the SFT source data.
- Efficient Training: Used LoRA (Low-Rank Adaptation) for efficient fine-tuning, with a learning rate of 2e-5 over 3 epochs.
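The headline 91.4% figure is the relative (not absolute) reduction in refusals. A minimal sanity check, assuming the underlying counts were 58 and 5 refusals out of 180 attempts (an assumption that is consistent with the rounded rates of 32.2% and 2.8%):

```python
attempts = 180
refusals_before = 58   # assumed count: 58/180 ≈ 32.2%
refusals_after = 5     # assumed count: 5/180 ≈ 2.8%

rate_before = refusals_before / attempts
rate_after = refusals_after / attempts
relative_reduction = (refusals_before - refusals_after) / refusals_before

print(f"{rate_before:.1%} -> {rate_after:.1%}, "
      f"relative reduction {relative_reduction:.1%}")
# -> 32.2% -> 2.8%, relative reduction 91.4%
```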
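LoRA freezes the base weights and trains only a low-rank update W + (α/r)·BA, which is what makes this small-scale SFT cheap. A rough sketch of the parameter savings for one square projection matrix, using Qwen2.5-7B's hidden size of 3584 and a hypothetical rank of 16 (the card does not state the rank or α actually used):

```python
def lora_params(d_in, d_out, r):
    # LoRA trains A (r x d_in) and B (d_out x r); the base weight W stays frozen.
    return r * d_in + d_out * r

d = 3584                          # Qwen2.5-7B hidden size
full = d * d                      # trainable params if W itself were tuned
lora = lora_params(d, d, r=16)    # hypothetical rank, for illustration only

print(f"full: {full:,}  LoRA: {lora:,}  ({lora / full:.2%} of full)")
# -> full: 12,845,056  LoRA: 114,688  (0.89% of full)
```

With under 1% of the per-matrix parameters trainable, a few epochs at a 2e-5 learning rate are enough to shift refusal behavior without touching the base weights.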
Use Cases
This model is particularly well-suited for applications where:
- Safety and Bias Mitigation are Critical: Ideal for systems requiring a reduced likelihood of biased or unwarranted refusals.
- Japanese Language Processing: Benefits from improved JA-MT Bench scores, making it suitable for Japanese-centric tasks.
- Maintaining Performance with Enhanced Safety: Offers a solution for developers who need to enhance model safety without sacrificing general capabilities.