cjiao/goldengoose-ld_match_hd_range-25grp

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kPublished:May 28, 2026Architecture:Transformer Warm

The cjiao/goldengoose-ld_match_hd_range-25grp model is a 1.5 billion parameter instruction-tuned language model, fine-tuned from Qwen/Qwen2.5-1.5B-Instruct. Developed by cjiao, this model was trained using the GRPO method, which is specifically designed to enhance mathematical reasoning capabilities. It is optimized for tasks requiring robust logical and mathematical problem-solving, leveraging a 32K context length.

Loading preview...

Model Overview

The cjiao/goldengoose-ld_match_hd_range-25grp is a 1.5 billion parameter language model, fine-tuned from the Qwen2.5-1.5B-Instruct base model. This model was developed by cjiao and specifically trained using the GRPO (Gradient-based Reasoning Policy Optimization) method, as introduced in the research paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300).

Key Capabilities

  • Enhanced Mathematical Reasoning: The primary differentiator of this model is its training with GRPO, which aims to significantly improve its ability to handle mathematical and logical reasoning tasks.
  • Instruction Following: As a fine-tuned instruction model, it is designed to follow user prompts and generate relevant responses effectively.
  • Context Length: Supports a substantial context window of 32,768 tokens, allowing for processing longer inputs and maintaining coherence over extended conversations or documents.

Training Details

The model was fine-tuned using the TRL (Transformer Reinforcement Learning) framework, version 0.19.1. The GRPO method focuses on optimizing the model's reasoning policy, making it particularly suitable for applications where accurate logical deduction and mathematical problem-solving are critical.

Good For

  • Applications requiring strong mathematical problem-solving.
  • Tasks involving logical reasoning and complex instruction following.
  • Use cases where a smaller, efficient model with specialized reasoning capabilities is preferred over larger, more general-purpose models.