Name: cjiao/golden-goose-qwen2.5-1.5b-instruct-greedy-top API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: cjiao

Model Overview

The cjiao/golden-goose-qwen2.5-1.5b-instruct-greedy-top is a 1.5 billion parameter instruction-tuned language model, developed by cjiao. It is a fine-tuned variant of the Qwen/Qwen2.5-1.5B-Instruct base model, leveraging the TRL framework for its training process.

Key Differentiator: GRPO Training

A significant aspect of this model is its training methodology. It was fine-tuned using GRPO (Greedy Policy Optimization), a method highlighted in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300). This suggests an optimization for tasks that benefit from advanced reasoning capabilities, potentially including mathematical or logical problem-solving.

Technical Specifications

Base Model: Qwen2.5-1.5B-Instruct
Parameter Count: 1.5 billion
Context Length: 32768 tokens
Training Framework: TRL (Transformers Reinforcement Learning)

Potential Use Cases

Given its GRPO-based fine-tuning, this model could be particularly effective for:

Reasoning-intensive tasks: Applications requiring logical deduction or problem-solving.
Instruction following: Benefiting from its instruction-tuned base.
Long-context applications: Utilizing its substantial 32K token context window.

Overview

Model Overview

Key Differentiator: GRPO Training

Technical Specifications

Potential Use Cases

Full Model Card (README)