tgtgeo/gensyn-checkpoints-jumping_gentle_ant
The tgtgeo/gensyn-checkpoints-jumping_gentle_ant model is a 0.5 billion parameter language model, fine-tuned from Gensyn/Qwen2.5-1.5B-Instruct. It was trained using the TRL framework and incorporates the GRPO method, which is designed to enhance mathematical reasoning capabilities. With a context length of 131072 tokens, this model is particularly suited for tasks requiring advanced reasoning, especially in mathematical contexts.
Model Overview
The tgtgeo/gensyn-checkpoints-jumping_gentle_ant model is a 0.5 billion parameter language model, derived from the Gensyn/Qwen2.5-1.5B-Instruct base model. It has been fine-tuned using the TRL (Transformer Reinforcement Learning) framework, specifically leveraging the GRPO (Group Relative Policy Optimization) method.
Key Capabilities
- Enhanced Mathematical Reasoning: Fine-tuning with GRPO, the method introduced in the DeepSeekMath paper, indicates a focus on improving the model's ability to work through complex mathematical problems and multi-step reasoning tasks.
- Large Context Window: With a context length of 131072 tokens, the model can process and generate text based on extensive input, which is beneficial for tasks requiring long-range dependencies or detailed contextual understanding.
- Instruction Following: As a fine-tuned instruction model, it is designed to respond effectively to user prompts and instructions.
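Since this is an instruction-tuned checkpoint, it can be queried through the standard Transformers chat interface. The sketch below is untested against this particular checkpoint: the repo id comes from this card, while the example question, generation settings, and the `build_messages` helper are illustrative.

```python
MODEL_ID = "tgtgeo/gensyn-checkpoints-jumping_gentle_ant"

def build_messages(question: str):
    """Wrap a user question in the chat-message format used by instruct models."""
    return [{"role": "user", "content": question}]

if __name__ == "__main__":
    # Heavy dependency imported lazily so the helper above stays importable.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    messages = build_messages("What is 17 * 24? Show your reasoning.")
    # apply_chat_template formats the conversation the way the model was trained on.
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    )
    outputs = model.generate(inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, not the echoed prompt.
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```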
Training Details
The model's training procedure utilized GRPO, a technique introduced in the paper "DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models" (arXiv:2402.03300). The training environment included TRL 0.15.2, Transformers 4.51.3, PyTorch 2.6.0, Datasets 3.6.0, and Tokenizers 0.21.1.
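The core idea of GRPO is to score each sampled completion relative to the other completions drawn for the same prompt, replacing a learned value network with group statistics. A minimal sketch of that advantage computation (the function name and the choice of population standard deviation are illustrative; in practice TRL's `GRPOTrainer` handles this internally):

```python
import statistics

def group_relative_advantages(rewards):
    """GRPO-style advantages for one group of completions sampled
    for the same prompt: each reward is normalized by the group's
    mean and standard deviation."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        # All completions scored the same: no learning signal for this group.
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]

# Example: binary correctness rewards for G = 4 completions of one math prompt.
rewards = [1.0, 0.0, 1.0, 0.0]
print(group_relative_advantages(rewards))  # -> [1.0, -1.0, 1.0, -1.0]
```

Completions that beat the group average get a positive advantage and are reinforced; below-average ones are penalized, which is what drives the improvement on reasoning benchmarks reported for GRPO-trained models.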
Good For
- Applications requiring strong mathematical reasoning.
- Tasks benefiting from a very large context window.
- Instruction-following scenarios where precise and context-aware responses are needed.