Model Overview
UCSC-VLAA/STAR1-R1-Distill-32B is a 32.8 billion parameter model developed by UCSC-VLAA, fine-tuned from DeepSeek-R1-Distill-Qwen-32B. It was trained on the STAR-1 dataset, a collection of 1,000 carefully curated examples focused on safety alignment for large reasoning models (LRMs).
Key Capabilities
- Enhanced Safety Alignment: The model is specifically trained to improve safety practices in AI reasoning, integrating principles of diversity, deliberative reasoning, and rigorous filtering.
- Policy-Grounded Reasoning: The STAR-1 training data was scored and filtered with GPT-4o against explicit safety policies, so the model's reasoning is grounded in those policies rather than in surface-level refusal patterns.
- Reasoning Preservation: Designed to deliver substantial safety improvements while retaining core reasoning ability, as demonstrated by evaluations on reasoning and safety benchmarks.
Training Data
The model is trained on the STAR-1 dataset, a high-quality safety dataset of 1,000 examples. Related datasets, STAR-41K and STAR-benign-915, are also available from UCSC-VLAA.
Use Cases
This model is well-suited to applications where the safety and ethical alignment of reasoning outputs are paramount but complex reasoning ability must be preserved. It is a strong choice for developers deploying safer systems built on large reasoning models.
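One common deployment pattern for a model of this size is serving it behind an OpenAI-compatible endpoint (for example with an inference server such as vLLM) and sending chat-completion requests to it. The sketch below builds such a request payload using only the Python standard library; the endpoint URL, sampling parameters, and helper name are illustrative assumptions, not part of this model card.

```python
import json

# Hypothetical OpenAI-compatible endpoint, e.g. as exposed by an
# inference server on localhost (URL and port are assumptions).
ENDPOINT = "http://localhost:8000/v1/chat/completions"

def build_chat_request(user_message: str, temperature: float = 0.6) -> str:
    """Build a JSON chat-completion request targeting the STAR-1-tuned model.

    The sampling values here are illustrative defaults, not
    recommendations from the model authors.
    """
    payload = {
        "model": "UCSC-VLAA/STAR1-R1-Distill-32B",
        "messages": [
            {"role": "user", "content": user_message},
        ],
        "temperature": temperature,
        "max_tokens": 2048,
    }
    return json.dumps(payload)

body = build_chat_request("Explain why input validation matters for web security.")
print(json.loads(body)["model"])  # → UCSC-VLAA/STAR1-R1-Distill-32B
```

The resulting JSON string can be POSTed to the endpoint with any HTTP client; keeping payload construction in a small helper like this makes it easy to swap in different sampling settings per request.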