Name: cjiao/goldengoose-low_div_rand-25grp API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: cjiao

Model Overview

cjiao/goldengoose-low_div_rand-25grp is a 1.5 billion parameter language model, fine-tuned from the Qwen/Qwen2.5-1.5B-Instruct base model. This model distinguishes itself through its specialized training procedure, which utilizes the TRL (Transformer Reinforcement Learning) library.

Key Differentiator: GRPO Training

A core aspect of this model's development is its training with GRPO (Gradient-based Reward Policy Optimization), a method introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300). This indicates a focus on enhancing the model's capabilities in mathematical reasoning and problem-solving.

Technical Details

Base Model: Qwen/Qwen2.5-1.5B-Instruct
Training Framework: TRL (version 0.19.1)
Parameter Count: 1.5 billion
Context Length: 32768 tokens

Intended Use Cases

Given its GRPO-based training, this model is particularly well-suited for applications requiring:

Mathematical Reasoning: Solving complex math problems and understanding mathematical concepts.
Logical Deduction: Tasks that benefit from structured reasoning and problem-solving approaches.
Instruction Following: General instruction-tuned capabilities inherited from its base model, with an emphasis on analytical tasks.

Overview

Model Overview

Key Differentiator: GRPO Training

Technical Details

Intended Use Cases

Full Model Card (README)