cjiao/goldengoose-corr-v4-0.25-200
The cjiao/goldengoose-corr-v4-0.25-200 model is a 1.5-billion-parameter instruction-tuned language model fine-tuned from Qwen/Qwen2.5-1.5B-Instruct. Developed by cjiao, it was trained with GRPO (Group Relative Policy Optimization), a method designed to enhance mathematical reasoning. The model targets tasks that require robust reasoning, particularly in mathematical contexts, and supports a context length of 32,768 tokens.
Model Overview
The cjiao/goldengoose-corr-v4-0.25-200 is a 1.5 billion parameter language model, fine-tuned from the Qwen/Qwen2.5-1.5B-Instruct base model. It was developed by cjiao and trained using the TRL framework.
Key Differentiator
This model's primary distinction lies in its training methodology. It was trained with GRPO (Group Relative Policy Optimization), a reinforcement learning method introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models". GRPO replaces PPO's learned value model with rewards normalized within a group of sampled completions, making it well suited to optimizing mathematical reasoning.
Training Details
The model was fine-tuned using the TRL library. GRPO, central to its training, aims to improve performance on complex reasoning tasks, particularly mathematical problem-solving. The model supports a context length of 32,768 tokens.
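The core idea behind GRPO can be illustrated in a few lines. The sketch below is an assumption-labeled illustration of the group-relative advantage computation described in the DeepSeekMath paper, not cjiao's actual training code: for each prompt, a group of completions is sampled, each is scored with a reward, and rewards are normalized within the group, so no separate value model is needed.

```python
# Illustrative sketch of GRPO's group-relative advantage (not the
# model author's training code). Rewards within one sampled group
# are standardized: advantage_i = (r_i - mean) / (std + eps).
from statistics import mean, stdev

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of sampled completions."""
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: 4 sampled answers to one math problem, scored 1 if the
# final answer is correct and 0 otherwise (a verifiable reward).
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Correct completions receive a positive advantage and incorrect ones a negative advantage, and the advantages in each group sum to zero, which is what lets GRPO dispense with a learned value baseline.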
Use Cases
Given its GRPO-enhanced training, this model is particularly well-suited for applications requiring:
- Mathematical reasoning: Solving complex math problems or generating logical steps for mathematical proofs.
- Instruction following: Responding accurately to detailed instructions, especially in analytical contexts.
- General language generation: While specialized, it retains the general capabilities of its Qwen2.5-1.5B-Instruct base for various text generation tasks.
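A minimal inference sketch for these use cases, assuming the standard Hugging Face `transformers` text-generation API and the chat template inherited from the Qwen2.5 instruct family (running it requires network access to the Hub and enough memory for a 1.5B model):

```python
# Hypothetical usage sketch for cjiao/goldengoose-corr-v4-0.25-200;
# assumes the standard `transformers` AutoModelForCausalLM API.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "cjiao/goldengoose-corr-v4-0.25-200"

# Qwen2.5-style instruct models take role/content message dicts,
# rendered into a prompt via the tokenizer's chat template.
messages = [
    {"role": "user", "content": "Solve step by step: what is 17 * 24?"},
]

def generate(model_id: str = MODEL_ID, max_new_tokens: int = 256) -> str:
    """Load the model and generate a reply (downloads weights on first use)."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

if __name__ == "__main__":
    print(generate())
```

For reasoning-heavy prompts, asking the model to work "step by step" plays to the GRPO-trained strengths described above.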