TMLR-Group-HF/GT-Qwen3-1.7B-Base-MATH

Text generation · Concurrency cost: 1 · Model size: 2B · Quantization: BF16 · Context length: 32k · Published: Aug 14, 2025 · License: MIT · Architecture: Transformer · Open weights

TMLR-Group-HF/GT-Qwen3-1.7B-Base-MATH is a 1.7-billion-parameter Qwen3-based language model developed by TMLR-Group-HF, trained with the GRPO Ground Truth method on a mathematical dataset. With a 40,960-token context length, the model is optimized for reasoning and mathematical tasks, and its specialized training makes it well suited to applications that demand robust mathematical problem solving.


Model Overview

TMLR-Group-HF/GT-Qwen3-1.7B-Base-MATH is a 1.7 billion parameter model built upon the Qwen3 architecture. Developed by TMLR-Group-HF, this model distinguishes itself through its specialized training regimen. It leverages the GRPO Ground Truth method, with a primary focus on a dedicated mathematical training set.

Key Capabilities

  • Mathematical Reasoning: Optimized for tasks requiring mathematical understanding and problem-solving.
  • Specialized Training: Utilizes the GRPO Ground Truth method for enhanced performance in its target domain.
  • Qwen3 Architecture: Benefits from the foundational capabilities of the Qwen3 model family.

Good For

  • Mathematical Applications: Ideal for use cases involving complex calculations, proofs, or mathematical text generation.
  • Research in Mathematical LLMs: Provides a base for further exploration and fine-tuning in the domain of AI for mathematics.
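As a sketch of how the model could be tried on such tasks, the snippet below loads it through the Hugging Face `transformers` library. The repo id comes from this card; the `solve_math` helper, its prompt, and its defaults are illustrative choices, not part of the release.

```python
# Minimal sketch: loading GT-Qwen3-1.7B-Base-MATH with transformers.
# Assumes `transformers` and `torch` are installed; the helper name
# `solve_math` and its parameters are hypothetical conveniences.

MODEL_ID = "TMLR-Group-HF/GT-Qwen3-1.7B-Base-MATH"

def solve_math(prompt: str, max_new_tokens: int = 256) -> str:
    # Heavy imports are kept inside the function so the module can be
    # inspected without transformers/torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")

    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens and return only the generated continuation.
    generated = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(generated, skip_special_tokens=True)

if __name__ == "__main__":
    print(solve_math("Solve for x: 3x + 5 = 20."))
```

Since this is a base-style model, plain completion prompts like the one above tend to work better than chat-formatted input.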

For details on the underlying methodology, including the Co-Reward approach, see the TMLR-Group GitHub repository.