Name: cjiao/goldengoose-ld_match_hd_range-25grp API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: cjiao

Model Overview

cjiao/goldengoose-ld_match_hd_range-25grp is a 1.5 billion parameter language model fine-tuned from the Qwen2.5-1.5B-Instruct base model. It was developed by cjiao and trained with GRPO (Group Relative Policy Optimization), the reinforcement learning method introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300).

Key Capabilities

Enhanced Mathematical Reasoning: The model's main differentiator is its GRPO training, a method designed to improve performance on mathematical and logical reasoning tasks.
Instruction Following: As a fine-tuned instruction model, it is designed to follow user prompts and generate relevant responses effectively.
Context Length: Supports a substantial context window of 32,768 tokens, allowing for processing longer inputs and maintaining coherence over extended conversations or documents.
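
The context window above is shared between the prompt and the generated completion, so a long prompt leaves less room for output. A minimal sketch of that arithmetic follows; the helper name completion_budget and its parameters are hypothetical, and it assumes the hosting service exposes the full 32,768-token window.

```python
CONTEXT_WINDOW = 32_768  # tokens, as stated in this model card


def completion_budget(prompt_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many tokens remain for the completion after the prompt.

    Hypothetical helper: assumes prompt and completion share one window.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    # Clamp at zero so an over-long prompt never yields a negative budget.
    return max(context_window - prompt_tokens, 0)


remaining = completion_budget(30_000)
```

Counting prompt_tokens requires the model's tokenizer; character or word counts are only rough proxies.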

Training Details

The model was fine-tuned using the TRL (Transformer Reinforcement Learning) framework, version 0.19.1. GRPO samples a group of completions for each prompt and scores each one relative to the group's average reward, which removes the need for a separate value model. This makes it a natural fit for applications where accurate logical deduction and mathematical problem-solving are critical.
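
The core of GRPO is the group-relative advantage: each sampled completion's reward is normalized against the other completions for the same prompt. The sketch below illustrates only that step in plain Python. It is not the TRL implementation or the author's training code; the function name, the eps value, and the use of the population standard deviation are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-4):
    """Normalize each reward against the mean and std of its own group.

    Illustrative only: eps and the population std are assumed choices.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division defined when every reward in the group is equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled completions for one prompt, scored 1.0 if correct, else 0.0.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Correct completions receive a positive advantage and incorrect ones a negative advantage. When every completion in a group gets the same reward, all advantages are zero and that prompt contributes no learning signal.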

Good For

Applications requiring strong mathematical problem-solving.
Tasks involving logical reasoning and complex instruction following.
Use cases where a smaller, efficient model with specialized reasoning capabilities is preferred over larger, more general-purpose models.
