Name: cjiao/goldengoose-gumbel_gmrel_tau2.00-25grp API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: cjiao

Model Overview

The cjiao/goldengoose-gumbel_gmrel_tau2.00-25grp is a 1.5 billion parameter instruction-tuned language model, building upon the base architecture of Qwen/Qwen2.5-1.5B-Instruct. It features a substantial context length of 32768 tokens, making it suitable for processing longer inputs and generating comprehensive responses.

Key Differentiator: GRPO Training

What sets this model apart is its training methodology. It was fine-tuned using GRPO (Gumbel-softmax Reinforcement Learning with Policy Optimization), a technique introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models." This specialized training aims to significantly improve the model's capabilities in mathematical reasoning and complex problem-solving.

Potential Use Cases

Mathematical Problem Solving: Ideal for applications requiring logical deduction and mathematical computation.
Complex Reasoning Tasks: Suitable for scenarios where structured, step-by-step reasoning is crucial.
Instruction Following: Benefits from its instruction-tuned base, making it responsive to detailed prompts.

Training Details

The model was trained using the TRL framework (version 0.19.1) and leverages PyTorch (version 2.5.1). The training process is publicly logged and can be visualized via Weights & Biases.

Overview

Model Overview

Key Differentiator: GRPO Training

Potential Use Cases

Training Details

Full Model Card (README)