Name: cjiao/goldengoose-gumbel_tau0.10-25grp API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: cjiao

Model Overview

The cjiao/goldengoose-gumbel_tau0.10-25grp is a 1.5 billion parameter instruction-tuned language model, building upon the Qwen/Qwen2.5-1.5B-Instruct architecture. It was fine-tuned using the TRL framework and incorporates the GRPO (Gumbel-softmax Relaxed Policy Optimization) method, as introduced in the DeepSeekMath paper, to enhance its reasoning capabilities.

Key Capabilities

Enhanced Mathematical Reasoning: Leverages the GRPO training method, specifically designed to improve performance on mathematical and logical reasoning tasks.
Instruction Following: As an instruction-tuned model, it is capable of understanding and executing a wide range of user prompts.
Context Length: Supports a substantial context window of 32,768 tokens, allowing for processing longer inputs and maintaining conversational coherence.

Training Details

The model's training procedure involved the GRPO method, which is detailed in the research paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300). This approach aims to push the boundaries of mathematical reasoning in open language models.

Good For

Applications requiring strong mathematical problem-solving.
Tasks that benefit from advanced logical reasoning.
Scenarios where a compact yet capable model with good instruction-following is needed.

Overview

Model Overview

Key Capabilities

Training Details

Good For

Full Model Card (README)