Name: cjiao/golden-goose-qwen2.5-1.5b-instruct-greedy-bottom API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: cjiao

Overview

cjiao/golden-goose-qwen2.5-1.5b-instruct-greedy-bottom is a 1.5 billion parameter language model, fine-tuned from the Qwen/Qwen2.5-1.5B-Instruct base model. It leverages a substantial 32768-token context window, making it suitable for processing longer inputs and maintaining conversational coherence over extended interactions.

Key Differentiator: GRPO Training

This model's primary distinction lies in its training methodology. It was fine-tuned using GRPO (Greedy-Bottom Reinforcement Learning), a method introduced in the research paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models". This specialized training aims to significantly enhance the model's ability to perform complex mathematical reasoning tasks.

Training Framework

The fine-tuning process utilized the TRL (Transformers Reinforcement Learning) library, indicating a focus on reinforcement learning from human feedback or similar techniques to refine its instruction-following and response generation. Specific framework versions used include TRL 1.1.0, Transformers 4.57.6, Pytorch 2.10.0, Datasets 4.8.4, and Tokenizers 0.22.2.

Potential Use Cases

Given its GRPO training, this model is particularly well-suited for applications requiring:

Mathematical problem-solving: Tasks involving arithmetic, algebra, geometry, or more advanced mathematical concepts.
Logical reasoning: Scenarios where structured thought and step-by-step deduction are crucial.
Instruction following: Benefiting from its instruction-tuned base and further refinement.

Developers can integrate this model using the Hugging Face pipeline for text generation, as demonstrated in the quick start example.

Overview

Overview

Key Differentiator: GRPO Training

Training Framework

Potential Use Cases

Full Model Card (README)