Name: cjiao/goldengoose-gumbel_combined_indoc_tau0.50-25grp API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: cjiao

Model Overview

The cjiao/goldengoose-gumbel_combined_indoc_tau0.50-25grp is a 1.5 billion parameter language model, fine-tuned from the Qwen/Qwen2.5-1.5B-Instruct base model. It was developed by cjiao using the TRL framework.

Key Capabilities

Enhanced Mathematical Reasoning: This model incorporates the GRPO (Gumbel-softmax Reinforcement Learning with Policy Optimization) training method, as introduced in the research paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models". This method is specifically designed to improve a model's ability to handle complex mathematical and logical reasoning tasks.
Instruction Following: As a fine-tuned version of an instruction-tuned model, it is capable of following user instructions effectively.
Context Length: Benefits from the Qwen2.5 base model's substantial 32768 token context window, allowing for processing longer inputs and maintaining coherence over extended interactions.

Training Details

The model was trained using the TRL (Transformer Reinforcement Learning) library, with specific framework versions including TRL 0.19.1, Transformers 4.57.6, Pytorch 2.5.1, Datasets 4.8.4, and Tokenizers 0.22.2.

Good for

Applications requiring strong mathematical problem-solving.
Tasks where logical reasoning is critical.
Instruction-following scenarios that can leverage its enhanced reasoning capabilities.

Overview

Model Overview

Key Capabilities

Training Details

Good for

Full Model Card (README)