cjiao/goldengoose-top25_gmrel_polar-25grp

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kPublished:May 17, 2026Architecture:Transformer Warm

The cjiao/goldengoose-top25_gmrel_polar-25grp model is a 1.5 billion parameter instruction-tuned language model, fine-tuned from Qwen/Qwen2.5-1.5B-Instruct. It was trained using the GRPO method, which is designed to enhance mathematical reasoning capabilities. With a context length of 32768 tokens, this model is optimized for tasks requiring robust reasoning, particularly in mathematical contexts.

Loading preview...

Model Overview

The cjiao/goldengoose-top25_gmrel_polar-25grp is a 1.5 billion parameter language model, fine-tuned from the Qwen/Qwen2.5-1.5B-Instruct base model. It leverages a substantial context length of 32768 tokens, making it suitable for processing longer inputs and generating comprehensive responses.

Key Capabilities

  • Enhanced Mathematical Reasoning: This model was specifically trained using the GRPO (Guided Reasoning Policy Optimization) method, as introduced in the DeepSeekMath paper. This training approach aims to significantly improve the model's ability to handle and solve mathematical reasoning problems.
  • Instruction Following: As a fine-tuned version of an instruction-tuned model, it is designed to follow user instructions effectively, generating relevant and coherent text based on prompts.
  • Efficient Performance: With 1.5 billion parameters, it offers a balance between performance and computational efficiency, making it accessible for various applications.

Training Details

The model's unique strength in mathematical reasoning stems from its training with GRPO, a technique detailed in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300). The fine-tuning process was conducted using the TRL library (version 0.19.1), with Transformers 4.57.6 and PyTorch 2.5.1.

Use Cases

This model is particularly well-suited for applications requiring:

  • Mathematical problem-solving and explanation generation.
  • Reasoning tasks where logical deduction is crucial.
  • General instruction-following for text generation in a compact model size.