Harsha901/Qwen3_4B-GRPO-Math

Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Dec 17, 2025 · License: apache-2.0 · Architecture: Transformer · Open Weights

Harsha901/Qwen3_4B-GRPO-Math is a 4-billion-parameter Qwen3 model developed by Harsha901 and fine-tuned from unsloth/Qwen3-4B-Base. The model is optimized for mathematical tasks, leveraging Unsloth and Hugging Face's TRL library for faster training. It features a 40,960-token context length, making it suitable for complex problem-solving and detailed mathematical reasoning.


Harsha901/Qwen3_4B-GRPO-Math Overview

This model is a 4-billion-parameter Qwen3 variant, developed by Harsha901 and fine-tuned from the unsloth/Qwen3-4B-Base model. It was trained specifically to enhance its mathematical reasoning and problem-solving capabilities. The fine-tuning process used Unsloth and Hugging Face's TRL library, which enabled 2x faster training.
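The "GRPO" in the model name refers to Group Relative Policy Optimization, a reinforcement-learning method (available in TRL as `GRPOTrainer`) that scores groups of sampled completions with programmatic reward functions. The exact rewards used for this model are not published; the sketch below shows the kind of verifiable-correctness reward commonly used for math fine-tuning, with the function name and number-matching rule being illustrative assumptions:

```python
import re


def correctness_reward(completions, answers):
    """Score 1.0 when a completion's final number matches the gold answer.

    A toy example of a verifiable reward for GRPO-style math training;
    the actual reward functions behind this checkpoint are unknown.
    """
    rewards = []
    for completion, gold in zip(completions, answers):
        # Take the last number in the completion as the model's answer.
        nums = re.findall(r"-?\d+(?:\.\d+)?", completion)
        rewards.append(1.0 if nums and nums[-1] == str(gold) else 0.0)
    return rewards
```

A list of such functions would be passed as `reward_funcs` to `GRPOTrainer` along with a dataset of problems and gold answers.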

Key Capabilities

  • Mathematical Reasoning: Optimized for handling mathematical tasks and complex numerical problems.
  • Efficient Training: Benefits from Unsloth's accelerated training techniques.
  • Extended Context: Features a 40,960-token context length, allowing it to process lengthy mathematical problems or detailed instructions.
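The checkpoint can be loaded like any other Qwen3 model via the Transformers library. The sketch below is illustrative: the chat-control tokens follow the standard Qwen format (in practice, prefer `tokenizer.apply_chat_template`), and the generation settings are assumptions rather than values from the model card:

```python
def build_prompt(problem: str) -> str:
    """Wrap a math problem in Qwen-style chat markup.

    Hand-rolled here for illustration; in practice use
    tokenizer.apply_chat_template on the repo's tokenizer.
    """
    return (
        f"<|im_start|>user\n{problem}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def solve(problem: str, max_new_tokens: int = 512) -> str:
    """Generate a solution with the fine-tuned checkpoint."""
    # Heavy dependencies imported lazily so build_prompt stays standalone.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Harsha901/Qwen3_4B-GRPO-Math"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16
    )

    inputs = tokenizer(build_prompt(problem), return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens, keeping only the generated answer.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Note that calling `solve(...)` downloads roughly 8 GB of BF16 weights on first use, so a GPU with sufficient memory is advisable.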

Good For

  • Applications requiring strong mathematical problem-solving.
  • Tasks that benefit from a large context window for detailed input or output.
  • Developers looking for a Qwen3-based model with enhanced numerical capabilities.