cjiao/goldengoose-top25_gmrel-25grp

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kPublished:May 17, 2026Architecture:Transformer Warm

The cjiao/goldengoose-top25_gmrel-25grp model is a 1.5 billion parameter language model developed by cjiao, fine-tuned from Qwen/Qwen2.5-1.5B-Instruct. It utilizes the GRPO method, as introduced in the DeepSeekMath paper, to enhance its reasoning capabilities. With a context length of 32768 tokens, this model is optimized for tasks requiring advanced mathematical and general reasoning.

Loading preview...

Model Overview

The cjiao/goldengoose-top25_gmrel-25grp model is a specialized language model developed by cjiao, built upon the Qwen/Qwen2.5-1.5B-Instruct architecture. This 1.5 billion parameter model features a substantial context length of 32768 tokens, making it suitable for processing longer inputs and complex queries.

Key Capabilities and Training

This model's primary distinction lies in its training methodology. It has been fine-tuned using the GRPO (General Reasoning Policy Optimization) method. GRPO is a technique highlighted in the research paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300). This indicates a focus on improving the model's ability to handle intricate reasoning tasks, particularly those involving mathematical or logical deduction.

Technical Details

  • Base Model: Qwen/Qwen2.5-1.5B-Instruct
  • Parameter Count: 1.5 Billion
  • Context Length: 32768 tokens
  • Training Frameworks: TRL (version 0.19.1), Transformers (version 4.57.6), PyTorch (version 2.5.1), Datasets (version 4.8.4), Tokenizers (version 0.22.2)

When to Use This Model

Given its GRPO-enhanced training, this model is particularly well-suited for applications requiring:

  • Mathematical Reasoning: Solving problems that involve numerical logic, equations, or proofs.
  • Complex Problem Solving: Tasks that benefit from structured, step-by-step reasoning.
  • General Reasoning: Scenarios where logical inference and analytical thinking are crucial.

Developers can quickly integrate and experiment with the model using the provided Hugging Face pipeline for text generation.