cjiao/goldengoose-corr-v2-0.50-100

Text Generation

  • Concurrency Cost: 1
  • Model Size: 1.5B
  • Quantization: BF16
  • Context Length: 32K
  • Published: Apr 25, 2026
  • Architecture: Transformer

The cjiao/goldengoose-corr-v2-0.50-100 model is a 1.5 billion parameter instruction-tuned language model, fine-tuned from Qwen/Qwen2.5-1.5B-Instruct. It was trained using the GRPO method, which is designed to enhance mathematical reasoning capabilities. This model is optimized for tasks requiring improved reasoning, particularly in mathematical contexts, and supports a 32K context length.


Model Overview

cjiao/goldengoose-corr-v2-0.50-100 is a 1.5 billion parameter language model, fine-tuned from the Qwen/Qwen2.5-1.5B-Instruct base model. It leverages a 32,768 token context length, making it suitable for processing longer inputs.
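As a quick orientation, the snippet below shows one plausible way to load the model with Hugging Face transformers in BF16, matching the quantization listed above. The model ID is taken from this card; its availability and the exact config fields are assumptions.

```python
# Minimal loading sketch, assuming the model is hosted under the ID on this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cjiao/goldengoose-corr-v2-0.50-100"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 quantization listed above
    device_map="auto",
)

# The card lists a 32K context; for Qwen2.5-family configs this is exposed as
# max_position_embeddings (assumed to be 32768 here).
print(model.config.max_position_embeddings)
```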

Key Capabilities

  • Enhanced Reasoning: This model was trained with GRPO (Group Relative Policy Optimization), the reinforcement learning method introduced in the DeepSeekMath paper, with the aim of substantially improving its mathematical reasoning abilities.
  • Instruction Following: As a fine-tuned instruction model, it is designed to follow user prompts effectively, building upon the capabilities of its Qwen2.5-1.5B-Instruct base.
  • Efficient Fine-tuning: Training was carried out with the TRL (Transformer Reinforcement Learning) library, which provides a ready-made GRPO training loop; a minimal sketch follows this list.
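To make the GRPO-via-TRL setup concrete, here is a sketch in the shape of TRL's documented GRPOTrainer quickstart. The reward function and dataset below are illustrative placeholders; the actual reward signal and training data used to produce this checkpoint are not published on this card.

```python
# Hedged GRPO fine-tuning sketch using TRL's GRPOTrainer.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

def reward_len(completions, **kwargs):
    # Toy reward: prefer completions near 20 characters. A real math-reasoning
    # reward would score answers against ground truth instead.
    return [-abs(20 - len(completion)) for completion in completions]

# Placeholder dataset with a "prompt" column; not the data used for this model.
dataset = load_dataset("trl-lib/tldr", split="train")

training_args = GRPOConfig(output_dir="goldengoose-grpo")
trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-1.5B-Instruct",  # the base model named on this card
    reward_funcs=reward_len,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```

GRPO samples a group of completions per prompt and optimizes the policy against group-relative rewards, which is why the trainer takes a reward function over completions rather than a learned value model.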

When to Use This Model

This model is particularly well-suited for applications where robust mathematical reasoning and accurate instruction following are critical, especially within the constraints of a 1.5 billion parameter model. Its GRPO training makes it a strong candidate for tasks involving numerical problems, logical deductions, and other reasoning-intensive scenarios.
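For such tasks, a minimal inference sketch is shown below, assuming the model inherits the standard Qwen2.5 chat template from its base; the prompt and generation settings are illustrative.

```python
# Minimal generation sketch for a math-style prompt, assuming the standard
# Qwen2.5 chat template inherited from Qwen/Qwen2.5-1.5B-Instruct.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cjiao/goldengoose-corr-v2-0.50-100"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What is 17 * 24? Show your reasoning."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```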