Name: cjiao/goldengoose-gumbel_gradsim_tau0.10-25grp API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: cjiao

Model Overview

This model, cjiao/goldengoose-gumbel_gradsim_tau0.10-25grp, is a specialized 1.5 billion parameter instruction-tuned language model. It is a fine-tuned variant of the Qwen/Qwen2.5-1.5B-Instruct base model, developed by cjiao. A key differentiator is its training methodology: it leverages the GRPO (Gumbel-softmax Reinforcement Learning with Policy Optimization) method, as introduced in the research paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models". This training approach aims to significantly improve the model's proficiency in mathematical reasoning and complex problem-solving.

Key Capabilities

Enhanced Mathematical Reasoning: Specifically trained with the GRPO method to excel in tasks requiring logical deduction and mathematical problem-solving.
Instruction Following: Inherits strong instruction-following capabilities from its Qwen2.5-1.5B-Instruct base.
Efficient Size: At 1.5 billion parameters, it offers a balance between performance and computational efficiency for specialized tasks.

Good For

Applications requiring robust mathematical reasoning.
Tasks involving logical problem-solving and structured output.
Developers looking for a compact model with specialized reasoning capabilities.

Overview

Model Overview

Key Capabilities

Good For

Full Model Card (README)