cjiao/goldengoose-corr-v2-0.25-100

Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Context Length: 32k · Published: Apr 25, 2026 · Architecture: Transformer

The cjiao/goldengoose-corr-v2-0.25-100 model is a 1.5-billion-parameter language model fine-tuned from Qwen/Qwen2.5-1.5B-Instruct. Developed by cjiao, it was trained with the GRPO method, a reinforcement-learning technique designed to enhance mathematical reasoning. It targets reasoning-intensive tasks while building on the Qwen2.5 architecture.


Model Overview

cjiao/goldengoose-corr-v2-0.25-100 is a 1.5-billion-parameter language model fine-tuned from the Qwen/Qwen2.5-1.5B-Instruct base model. It was developed by cjiao and trained with the TRL library.
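Since the model inherits the Qwen2.5-Instruct chat format, it should load through the standard Transformers chat interface. The snippet below is a minimal sketch; the prompt text and generation settings are illustrative, not recommendations from the model card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cjiao/goldengoose-corr-v2-0.25-100"

# Load the tokenizer and BF16 weights (matching the quant listed above).
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Qwen2.5-Instruct derivatives use a chat template; this math prompt is an example.
messages = [{"role": "user", "content": "What is the sum of the first 50 positive integers?"}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Sampling parameters are example values, not tuned settings for this model.
outputs = model.generate(inputs, max_new_tokens=512, temperature=0.7, top_p=0.9, do_sample=True)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```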

Key Capabilities

  • Enhanced Reasoning: The model's training incorporated the GRPO (Group Relative Policy Optimization) method introduced in the DeepSeekMath paper, a technique designed to push the limits of mathematical and general reasoning in open language models (the objective is sketched after this list).
  • Instruction Following: Building on the Qwen2.5-1.5B-Instruct foundation, it retains strong instruction-following capabilities, making it suitable for various prompt-based tasks.
  • Efficient Performance: With 1.5 billion parameters and a context length of 32768 tokens, it offers a balance between performance and computational efficiency.
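
For context, the core mechanism of GRPO as described in the DeepSeekMath paper: for each prompt $q$, a group of $G$ completions $o_1,\dots,o_G$ is sampled, and each completion's advantage is its group-normalized reward, so no separate value model is needed. In condensed form (notation follows the paper; this summarizes the method, not this model's exact training configuration):

$$A_i = \frac{r_i - \operatorname{mean}(\{r_1,\dots,r_G\})}{\operatorname{std}(\{r_1,\dots,r_G\})}$$

$$\mathcal{J}_{\mathrm{GRPO}}(\theta) = \mathbb{E}\left[\frac{1}{G}\sum_{i=1}^{G}\min\bigl(\rho_i A_i,\ \operatorname{clip}(\rho_i,\,1-\varepsilon,\,1+\varepsilon)\,A_i\bigr)\right] - \beta\,\mathbb{D}_{\mathrm{KL}}\left[\pi_\theta \,\|\, \pi_{\mathrm{ref}}\right], \qquad \rho_i = \frac{\pi_\theta(o_i \mid q)}{\pi_{\theta_{\mathrm{old}}}(o_i \mid q)}$$

The group-relative baseline is what makes GRPO cheaper than PPO for reasoning tasks: the mean reward of the sampled group stands in for a learned critic.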

Training Details

The model was fine-tuned using the TRL library; the reported framework versions are TRL 0.19.1, Transformers 4.57.6, PyTorch 2.5.1, Datasets 4.8.4, and Tokenizers 0.22.2. The use of the GRPO method suggests a focus on improving logical and mathematical problem-solving abilities.
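The model card does not include the training script, but a GRPO fine-tune with TRL's GRPOTrainer typically follows the shape below. This is a minimal sketch: the dataset, reward function, and hyperparameters are placeholders from TRL's own quickstart, not the values cjiao used.

```python
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Placeholder dataset: GRPOTrainer expects a "prompt" column.
# The actual training data for this model is not documented on the card.
dataset = load_dataset("trl-lib/tldr", split="train")

def reward_len(completions, **kwargs):
    """Toy reward preferring completions near 100 characters.
    A real math-reasoning run would score answer correctness instead."""
    return [-abs(100 - len(c)) for c in completions]

training_args = GRPOConfig(
    output_dir="goldengoose-grpo",  # hypothetical output path
    num_generations=8,              # group size G: completions sampled per prompt
    per_device_train_batch_size=8,
)

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-1.5B-Instruct",  # the documented base model
    reward_funcs=reward_len,
    args=training_args,
    train_dataset=dataset,
)
trainer.train()
```

In a real run, the reward function is where the mathematical-reasoning signal enters: it would parse each completion and return a higher score for verified-correct answers.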

Good For

  • Applications requiring mathematical reasoning and complex problem-solving.
  • Tasks where instruction following and coherent text generation are crucial.
  • Scenarios needing a relatively compact yet capable language model for reasoning-intensive workloads.