qihoo360/TinyR1-32B-Preview

Parameters: 32.8B · Precision: FP8 · Context length: 131,072 tokens · License: apache-2.0
Overview

TinyR1-32B-Preview: A Specialized Reasoning Model

TinyR1-32B-Preview is qihoo360's first-generation reasoning model, designed to excel in mathematics, coding, and science. This 32-billion-parameter model is built on the DeepSeek-R1-Distill-Qwen-32B base and was fine-tuned using the 360-LLaMA-Factory framework.
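
For orientation, the model can be loaded through the standard Hugging Face transformers API, as in the minimal sketch below; the dtype and device settings are illustrative choices, not values specified by the model card.

```python
# Minimal loading sketch for TinyR1-32B-Preview using the standard
# transformers API. dtype/device settings are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "qihoo360/TinyR1-32B-Preview"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # shard across available accelerators
)
```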

Key Capabilities & Features

  • Domain-Specific Optimization: Achieves strong performance in Mathematics, Coding, and Science through a two-stage training approach: supervised fine-tuning (SFT) on specialized datasets for each domain, followed by model merging (a simplified sketch of the merging idea follows this list).
  • Competitive Reasoning Performance: Outperforms the 70B-parameter DeepSeek-R1-Distill-Llama-70B in mathematics (78.1 on AIME 2024) and posts competitive results in coding (61.6 on LiveCodeBench) and science (65.0 on GPQA-Diamond).
  • Training Data: Utilizes Chain-of-Thought (CoT) trajectories from datasets like OpenR1-Math-220k (math), OpenThoughts-114k (coding & science), and simplescaling/data_ablation_full59K (science).
  • Open-Sourced Resources: The training datasets and the full training and evaluation pipeline are open-sourced, along with a technical report on arXiv.
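
To make the SFT-then-merge recipe above concrete, the sketch below shows one common merging technique: uniform weight averaging of domain-specialized checkpoints. The checkpoint names are hypothetical, and uniform averaging is an assumption for illustration only; the exact merge recipe used for TinyR1 is described in the open-sourced pipeline and technical report.

```python
# Illustrative model merging via uniform weight averaging. This is NOT
# the exact TinyR1 recipe (see the open-sourced pipeline); the checkpoint
# paths below are hypothetical placeholders.
import torch
from transformers import AutoModelForCausalLM

checkpoints = ["sft-math", "sft-coding", "sft-science"]  # hypothetical SFT outputs
models = [
    AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.float32)
    for path in checkpoints
]
param_dicts = [dict(m.named_parameters()) for m in models]

merged = models[0]  # reuse the first model as the container for merged weights
with torch.no_grad():
    for name, param in merged.named_parameters():
        # Average the corresponding tensor across all domain checkpoints.
        param.copy_(torch.stack([d[name] for d in param_dicts]).mean(dim=0))

merged.save_pretrained("tinyr1-merged-sketch")
```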

Good For

  • Complex Reasoning Tasks: Ideal for applications requiring advanced problem-solving in mathematics, code generation, and scientific inquiry.
  • Research and Development: Serves as an experimental research model for advancing AI reasoning capabilities, particularly for those interested in its unique distillation and merging methodology.
  • Benchmarking: Useful for evaluating and comparing reasoning performance against other models in specialized domains.
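
As a rough illustration of the reasoning use cases above, the sketch below sends a single math prompt through the model's chat template; the prompt and sampling parameters are assumptions for demonstration, not recommended settings from the model card.

```python
# Hedged usage sketch for a reasoning prompt. Sampling settings are
# illustrative assumptions, not recommendations from the model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "qihoo360/TinyR1-32B-Preview"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "Prove that the sum of two odd integers is even."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=1024, do_sample=True, temperature=0.6)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```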