ganglii/DisCO-1.5B-logL is a 1.5 billion parameter language model fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B by Gang Li, Ming Lin, Tomer Galanti, Zhengzhong Tu, and Tianbao Yang. It was optimized using the DisCO framework with a Log-Likelihood score function on the DeepScaleR-Preview-Dataset. This model specializes in reasoning tasks, demonstrating improved performance on benchmarks like AIME, MATH, and AMC compared to its base model and other fine-tuning methods.
DisCO-1.5B-logL: Reasoning-Optimized 1.5B Model
This model is a 1.5 billion parameter language model developed by Gang Li, Ming Lin, Tomer Galanti, Zhengzhong Tu, and Tianbao Yang. It is a fine-tuned version of the deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B base model, specifically optimized using the DisCO (Discriminative Constrained Optimization) framework with a Log-Likelihood (Log-L) score function.
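The model follows the standard Hugging Face transformers interface inherited from its base model. Below is a minimal inference sketch; the prompt, sampling settings, and dtype are illustrative choices, not values prescribed by the authors.

```python
# Minimal inference sketch (assumes the standard transformers AutoModel API and
# a chat template inherited from DeepSeek-R1-Distill-Qwen-1.5B).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ganglii/DisCO-1.5B-logL"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # illustrative; pick a dtype your hardware supports
    device_map="auto",
)

# Example math prompt; any reasoning-style question works similarly.
messages = [{"role": "user", "content": "What is the sum of the first 50 positive integers?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# max_new_tokens=8192 mirrors the 8K response budget used in training and testing.
outputs = model.generate(inputs, max_new_tokens=8192, do_sample=True, temperature=0.6)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```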
Key Capabilities & Performance
- Enhanced Reasoning: Fine-tuned on the agentica-org/DeepScaleR-Preview-Dataset, this model shows significant improvements on reasoning benchmarks.
- Benchmark Superiority: Achieves an average score of 0.533 across AIME 2024, AIME 2025, MATH 500, AMC 2023, Minerva, and O-Bench, outperforming the base DeepSeek-R1-Distill-Qwen-1.5B (0.451) and other fine-tuning methods such as GRPO and DAPO (a sketch for extracting final answers in this style of evaluation follows this list).
- Long Context Support: The base model supports a context length of 131,072 tokens, and fine-tuning was conducted with a maximum response length of 8K tokens for both training and testing.
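When reproducing benchmark numbers like those above, completions are typically scored by pulling out the final boxed answer. The helper below is a hypothetical sketch assuming this model, like its DeepSeek-R1-Distill base, ends math solutions with a \boxed{...} expression; it is not part of the DisCO release.

```python
import re

def extract_boxed_answer(text: str) -> str | None:
    """Return the content of the last \\boxed{...} in a completion, if any.

    Assumes the model, like its DeepSeek-R1-Distill base, finishes math
    solutions with a boxed answer; handles one level of nested braces.
    """
    matches = re.findall(r"\\boxed\{((?:[^{}]|\{[^{}]*\})*)\}", text)
    return matches[-1] if matches else None

completion = r"... so the total is \boxed{1275}."
print(extract_boxed_answer(completion))  # -> 1275
```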
When to Use This Model
- Reasoning-intensive applications: Ideal for tasks requiring strong logical inference and problem-solving, particularly in mathematical and scientific domains.
- Benchmarking and Research: Useful for researchers exploring discriminative constrained optimization methods and their impact on small language models.
- Resource-constrained environments: With only 1.5B parameters, it offers competitive reasoning capability at a fraction of the memory and compute footprint of larger models.