Name: cjiao/goldengoose-gumbel-1.00-100 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: cjiao

Model Overview

The cjiao/goldengoose-gumbel-1.00-100 is a 1.5 billion parameter language model, fine-tuned from the Qwen/Qwen2.5-1.5B-Instruct base model. It leverages a substantial context length of 32768 tokens, making it suitable for processing longer inputs and maintaining conversational coherence over extended interactions.

Key Training Details

This model was trained using the GRPO (Gumbel-Softmax Policy Optimization) method, as introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300). The application of GRPO suggests a focus on improving the model's ability to handle complex reasoning tasks, particularly in mathematical domains. The fine-tuning process was conducted using the TRL (Transformer Reinforcement Learning) framework.

Potential Use Cases

Given its foundation in Qwen2.5-1.5B-Instruct and specialized training with GRPO, this model is likely well-suited for:

Mathematical Reasoning: Tasks involving problem-solving, logical deduction, and quantitative analysis.
Instruction Following: Generating responses based on specific user instructions, benefiting from its instruction-tuned base.
Long Context Understanding: Applications requiring the model to process and synthesize information from extensive textual inputs due to its 32768-token context window.

Overview

Model Overview

Key Training Details

Potential Use Cases

Full Model Card (README)