cjiao/goldengoose-corr-v2-0.80-100

Text Generation · Concurrency Cost: 1 · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Apr 25, 2026 · Architecture: Transformer

cjiao/goldengoose-corr-v2-0.80-100 is a 1.5-billion-parameter instruction-tuned causal language model, fine-tuned by cjiao from Qwen/Qwen2.5-1.5B-Instruct. It was trained with GRPO, the reinforcement learning method introduced in the DeepSeekMath paper, to strengthen its reasoning. With a context length of 32,768 tokens, the model is best suited to tasks that demand improved reasoning and response coherence, particularly conversational or question-answering scenarios.


Overview

cjiao/goldengoose-corr-v2-0.80-100 is a 1.5 billion parameter language model, fine-tuned from the Qwen/Qwen2.5-1.5B-Instruct base model. It was developed by cjiao using the TRL framework with GRPO (Group Relative Policy Optimization), the reinforcement learning method introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models", which suggests an emphasis on improving reasoning capabilities.
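The exact dataset and reward functions behind this checkpoint have not been published, but TRL exposes GRPO through its GRPOTrainer. The following is a minimal sketch of that API under stated assumptions: the demo dataset and the brevity reward are placeholders, not the author's recipe.

```python
# Minimal GRPO fine-tuning sketch with TRL. The dataset and reward
# function below are illustrative stand-ins; the actual training setup
# for this checkpoint has not been published.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

# Any dataset with a "prompt" column works; this TRL demo set is a stand-in.
dataset = load_dataset("trl-lib/tldr", split="train")

def reward_brevity(completions, **kwargs):
    # Toy reward: prefer completions near 200 characters.
    return [-abs(200 - len(c)) / 200.0 for c in completions]

args = GRPOConfig(
    output_dir="goldengoose-grpo-demo",
    num_generations=8,  # completions sampled per prompt for the group-relative baseline
    logging_steps=10,
)

trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-1.5B-Instruct",  # the base model named in this card
    reward_funcs=reward_brevity,
    args=args,
    train_dataset=dataset,
)
trainer.train()
```

GRPO's distinguishing design choice is that it scores each prompt's group of sampled completions against the group mean reward instead of training a separate value model, which keeps memory overhead low for small models like this one.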

Key Capabilities

  • Enhanced Reasoning: Leverages the GRPO training method, which is associated with advancements in mathematical reasoning in larger models, to potentially improve logical coherence in responses.
  • Instruction Following: Built upon an instruction-tuned base model, it is designed to follow user prompts effectively.
  • Conversational AI: Suitable for generating coherent, contextually relevant text in interactive settings such as question answering or dialogue generation (see the usage sketch after this list).
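Since the checkpoint inherits Qwen2.5's chat template, a standard transformers chat loop should work. A minimal sketch, assuming the repository is publicly available on the Hugging Face Hub (the prompt is illustrative):

```python
# Chat-style inference sketch, assuming the checkpoint is downloadable
# under this repository name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cjiao/goldengoose-corr-v2-0.80-100"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the BF16 quant listed above
    device_map="auto",
)

messages = [
    {"role": "user", "content": "A train travels 120 km in 1.5 hours. What is its average speed?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```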

Good for

  • General Text Generation: Creating diverse text outputs based on given prompts.
  • Question Answering Systems: Providing detailed and reasoned answers to complex questions.
  • Exploratory AI Development: Researchers and developers looking to experiment with models fine-tuned using advanced reinforcement learning techniques like GRPO, especially in the 1.5B parameter class.