zgao3186/qwen25math7b-one-shot-em
Task: Text generation
Model size: 7.6B parameters
Quantization: FP8
Context length: 32k
Published: May 29, 2025
License: MIT
Architecture: Transformer (open weights)

The zgao3186/qwen25math7b-one-shot-em model is a 7.6-billion-parameter language model based on the Qwen2.5-Math-7B architecture, developed by Zitian Gao and collaborators. It explores a post-training paradigm called One-shot Entropy Minimization (EM): instead of supervised fine-tuning or reinforcement learning on large labeled datasets, the model is further trained to reduce the entropy of its own output distribution, using minimal data and few optimization steps. The result is a reported improvement on mathematical reasoning benchmarks at a small fraction of the usual post-training cost.
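To make the objective concrete, below is a minimal sketch of a token-level entropy loss of the kind entropy-minimization methods optimize: the mean Shannon entropy of the model's per-token predictive distribution, which the training loop would drive toward zero. This is an illustrative NumPy implementation, not the authors' code; the function names and the exact averaging scheme are assumptions.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Numerically stable softmax over the vocabulary dimension.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def token_entropy_loss(logits):
    """Mean per-token entropy of the predictive distribution.

    `logits` has shape (seq_len, vocab_size). Entropy minimization
    uses this scalar (or a variant of it) as the training loss:
    confident (peaked) predictions give low entropy, uncertain
    (flat) predictions give high entropy.
    """
    p = softmax(logits)
    # Small epsilon guards log(0) for zero-probability tokens.
    per_token = -(p * np.log(p + 1e-12)).sum(axis=-1)
    return float(per_token.mean())
```

A uniform distribution over a vocabulary of size V has entropy ln(V), the maximum; a near-one-hot distribution has entropy near zero, so gradient descent on this loss sharpens the model's predictions.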
