Kazuki1450/Qwen3-1.7B-Base_csum_6_10_tok_Fourth_1p0_0p0_1p0_grpo_42_rule is a 2-billion-parameter language model fine-tuned from Qwen/Qwen3-1.7B-Base. It was trained with GRPO (Group Relative Policy Optimization), the reinforcement-learning method introduced in the DeepSeekMath paper to enhance mathematical reasoning. The model targets tasks that benefit from stronger reasoning, particularly mathematical ones, and supports a context length of 40,960 tokens.
Overview
This model, Kazuki1450/Qwen3-1.7B-Base_csum_6_10_tok_Fourth_1p0_0p0_1p0_grpo_42_rule, is a specialized fine-tune of the Qwen3-1.7B-Base architecture, with approximately 2 billion parameters and a 40,960-token context window. It was developed by Kazuki1450 on top of the Qwen3 foundation.
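The model can be loaded like any causal LM on the Hugging Face Hub. A minimal sketch follows; the repo id is taken from this card, while the `torch_dtype`/`device_map` settings and the sample prompt are illustrative assumptions, not requirements of the model.

```python
# Sketch: loading and prompting the model with Hugging Face Transformers.
# Note: this downloads the model weights on first run.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Kazuki1450/Qwen3-1.7B-Base_csum_6_10_tok_Fourth_1p0_0p0_1p0_grpo_42_rule"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # pick the checkpoint's native precision
    device_map="auto",    # place layers on available GPU(s)/CPU
)

# Example math prompt (illustrative); base-style models expect plain text.
prompt = "Question: What is 37 * 24? Answer step by step."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)

# Print only the newly generated continuation.
new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```

Since this is a base-style checkpoint, plain-text prompts (rather than a chat template) are the safer default.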
Key Capabilities
- Enhanced Reasoning: The model's primary differentiator is its training with the GRPO method, a technique introduced in the DeepSeekMath paper. This method is specifically designed to push the limits of mathematical reasoning in open language models.
- TRL-based Fine-tuning: The GRPO fine-tuning was carried out with the TRL (Transformer Reinforcement Learning) library, reflecting a focus on optimizing performance for a specific target task.
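The core idea behind GRPO, as described in the DeepSeekMath paper, is to sample a group of completions per prompt and compute each completion's advantage relative to the group, avoiding a separate value model. A minimal sketch of the group-relative advantage (outcome-supervision variant) is below; the function and variable names are illustrative, not taken from the training code.

```python
# Sketch of GRPO's group-relative advantage: for one prompt, sample a group
# of completions, score each with a reward, then standardize the rewards
# within the group: A_i = (r_i - mean(r)) / (std(r) + eps).
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Standardize each reward against its own group's mean and std."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]

# Example: 4 sampled answers to one math problem, scored 1.0 if correct.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 1.0])
# Correct answers get positive advantage, incorrect ones negative.
```

These per-completion advantages then weight a clipped policy-gradient objective, PPO-style, but without a learned critic.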
Good For
- Mathematical Reasoning Tasks: Given its training with the GRPO method, this model is particularly well-suited for applications requiring strong mathematical problem-solving and reasoning abilities.
- Research and Experimentation: Developers interested in exploring the impact of GRPO on smaller, efficient models will find this a valuable resource.
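The `_rule` suffix in the model name suggests a rule-based (verifiable) reward was used during GRPO training, as is common for math tasks; the exact rules are not documented on this card, so the sketch below is a hypothetical example of that style of reward, not the actual training reward.

```python
# Hypothetical rule-based reward for math problems: extract the last number
# in the completion and compare it to the reference answer. Names and the
# extraction rule are illustrative assumptions.
import re

def rule_based_reward(completion: str, reference: str) -> float:
    """Return 1.0 if the completion's final number matches the reference."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", completion)
    if not numbers:
        return 0.0  # no numeric answer found
    return 1.0 if numbers[-1] == reference else 0.0

r_correct = rule_based_reward("37 * 24 = 888. The answer is 888", "888")
r_missing = rule_based_reward("I am not sure of the answer.", "888")
```

Binary, automatically checkable rewards like this are what make GRPO practical for math: no learned reward model is needed.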
Limitations
As a fine-tune of a base (non-instruct) model, it may require further instruction tuning for general conversational or instruction-following use, though its specialized training suggests proficiency in its target domain.