Name: qihoo360/Light-R1-14B-DS API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: qihoo360

Light-R1-14B-DS: State-of-the-Art 14B Math Model

Light-R1-14B-DS, developed by Qihoo360, is a 14-billion-parameter model derived from DeepSeek-R1-Distill-Qwen-14B. It is the first open-source model to successfully apply Reinforcement Learning (RL) to a model that had already been fine-tuned on long Chain-of-Thought (CoT) data, and it does so within a modest computational budget. This approach yields notable improvements in mathematical reasoning.

Key Capabilities & Performance

State-of-the-Art Math Performance: Achieves leading scores among 14B math models, with 74.0 on AIME24 and 60.2 on AIME25, surpassing many 32B models.
RL Post-Training: Trained with a dedicated long-CoT RL post-training stage in which response length and reward score increased together, the behavior expected of successful RL training.
Robustness: Performs well on the GPQA benchmark without being specifically trained for it, indicating strong generalization.
Data Decontamination: Training data was decontaminated against the evaluation benchmarks using exact matching and N-gram matching to protect benchmark integrity.
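
The decontamination step mentioned above can be illustrated with a minimal sketch. This is not the Light-R1 pipeline itself: the function names, the text normalization, and the window size n=8 are all assumptions chosen for illustration. It only shows the general idea of dropping a training sample when it matches a benchmark item exactly or shares any word-level N-gram with one.

```python
import re


def normalize(text):
    """Lowercase the text and collapse non-alphanumeric runs into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def ngrams(text, n):
    """Return the set of word-level n-grams of the normalized text."""
    tokens = normalize(text).split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def is_contaminated(sample, benchmark_items, n=8):
    """True if the sample matches a benchmark item exactly or shares an n-gram with it.

    n=8 is an arbitrary illustrative value, not the setting used for Light-R1.
    """
    norm_sample = normalize(sample)
    sample_grams = ngrams(sample, n)
    for item in benchmark_items:
        if norm_sample == normalize(item):   # exact match after normalization
            return True
        if sample_grams & ngrams(item, n):   # at least one shared n-gram
            return True
    return False


def decontaminate(train_samples, benchmark_items, n=8):
    """Keep only the training samples that show no overlap with the benchmark."""
    return [s for s in train_samples if not is_contaminated(s, benchmark_items, n)]


# Hypothetical data, made up for this example.
benchmark = [
    "Find the number of positive integers n less than 1000 such that n is divisible by 7."
]
train = [
    "Find the number of positive integers n less than 1000 such that n is divisible by 7.",
    "Compute the sum of the first ten prime numbers.",
    "Hint: find the number of positive integers n less than 1000 such that n is even.",
]
clean = decontaminate(train, benchmark)
print(clean)
```

In this toy run the first sample is removed as an exact match and the third because it shares an 8-word window with the benchmark item, so only the unrelated second sample is kept. A smaller n removes more aggressively, while a larger n tolerates more incidental overlap.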

Good For

Advanced Mathematical Problem Solving: Ideal for applications requiring high-accuracy mathematical reasoning and complex problem-solving.
Research in RL for LLMs: Provides a valuable open-source example of RL applied successfully to an already fine-tuned model.
Benchmarking and Development: A strong candidate for evaluating and developing new techniques in mathematical AI.
