cjiao/goldengoose-corr-v4-random-200

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kPublished:May 3, 2026Architecture:Transformer Warm

The cjiao/goldengoose-corr-v4-random-200 is a 1.5 billion parameter instruction-tuned causal language model, fine-tuned by cjiao from the Qwen/Qwen2.5-1.5B-Instruct base model. Utilizing a 32768 token context length, this model was trained using the GRPO method, which is designed to enhance mathematical reasoning capabilities. It is optimized for tasks requiring advanced reasoning, particularly in mathematical contexts, making it suitable for applications where precise logical inference is crucial.

Loading preview...

Model Overview

The cjiao/goldengoose-corr-v4-random-200 is a 1.5 billion parameter language model, fine-tuned from the Qwen/Qwen2.5-1.5B-Instruct base model. Developed by cjiao, this model leverages a substantial 32768 token context window, making it capable of processing longer inputs and maintaining coherence over extended interactions.

Key Differentiator: GRPO Training

A significant aspect of this model is its training methodology. It was fine-tuned using GRPO (Gradient-based Reasoning Policy Optimization), a method introduced in the research paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models." This specialized training approach aims to enhance the model's capabilities in:

  • Mathematical Reasoning: Improving its ability to understand and solve complex mathematical problems.
  • Logical Inference: Strengthening its capacity for structured and accurate reasoning.

Technical Details

The model's training utilized the TRL framework (version 0.19.1) alongside Transformers (4.57.6), Pytorch (2.5.1), Datasets (4.8.4), and Tokenizers (0.22.2).

Use Cases

Given its GRPO-enhanced training, this model is particularly well-suited for applications requiring robust mathematical and logical reasoning, such as:

  • Solving mathematical word problems.
  • Generating logical explanations or proofs.
  • Tasks where precise, step-by-step reasoning is critical.