SomayJalan/OpenRS-GRPO
SomayJalan/OpenRS-GRPO is a 1.5 billion parameter language model fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B with a 32768-token context length. It was trained using the GRPO method on the knoveleng/open-rs dataset, specializing in mathematical reasoning and complex problem-solving. This model is optimized for tasks requiring advanced logical deduction and numerical understanding.
Model Overview
SomayJalan/OpenRS-GRPO is a 1.5 billion parameter language model derived from deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B. It was fine-tuned with GRPO (Group Relative Policy Optimization), the reinforcement-learning method introduced in the DeepSeekMath work, on the knoveleng/open-rs dataset. This training approach targets the model's capabilities in mathematical reasoning and complex problem-solving.
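To illustrate the core idea behind GRPO, here is a minimal sketch of group-relative advantage estimation: rewards for a group of completions sampled from the same prompt are normalized against the group's own mean and standard deviation, so no separate critic network is needed. This is a conceptual sketch, not the model's actual training code; the function name and the reward values are illustrative.

```python
from statistics import mean, stdev

def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Normalize each completion's reward against its group's statistics.

    GRPO uses these group-relative scores as advantages, in place of a
    learned value function (a hypothetical helper for illustration).
    """
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: four sampled answers to one math problem, reward 1.0 if correct.
# Correct answers get positive advantages, incorrect ones negative.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
```

Because the advantages are centered within each group, completions are rewarded only for being better than their peers on the same prompt, which is what makes the method well suited to verifiable tasks like math.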
Key Capabilities
- Mathematical Reasoning: Leverages the GRPO training method to improve performance on tasks requiring logical and mathematical deduction.
- Fine-tuned Performance: Built upon a robust base model and further optimized for specific reasoning challenges.
- Context Length: Supports a substantial context window of 32768 tokens, allowing for processing longer inputs and complex problem descriptions.
Good For
- Applications requiring strong mathematical and logical reasoning.
- Tasks involving complex problem-solving where detailed understanding and deduction are crucial.
- Research and development in advanced language model fine-tuning techniques, particularly GRPO.
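A minimal usage sketch with Hugging Face `transformers` is shown below. The generation settings and the prompt are illustrative assumptions, not tuned values from the model card; adjust them for your hardware and task.

```python
def build_messages(problem: str) -> list[dict]:
    # Wrap a math problem as a single-turn chat message.
    return [{"role": "user", "content": problem}]

def generate_solution(problem: str, max_new_tokens: int = 512) -> str:
    # Imported here so build_messages stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "SomayJalan/OpenRS-GRPO"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype="auto", device_map="auto"
    )

    inputs = tokenizer.apply_chat_template(
        build_messages(problem), add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

# Example (downloads the model weights on first run):
# print(generate_solution("What is the sum of the first 100 positive integers?"))
```

Given the 32768-token context window, long multi-step problem statements can be passed in a single prompt without truncation.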