HillPhelmuth/Qwen3-4B-GRPO-MathsFT
HillPhelmuth/Qwen3-4B-GRPO-MathsFT is a 2 billion parameter Qwen3 model developed by HillPhelmuth, fine-tuned from unsloth/Qwen3-1.7B-Base. It was trained with Unsloth and Hugging Face's TRL library, enabling 2x faster training. The README does not spell out the model's primary differentiator, though its name suggests GRPO fine-tuning on maths data, and its Qwen3 base implies general language understanding capabilities.
Model Overview
This model, developed by HillPhelmuth, is a Qwen3-based language model with 2 billion parameters, fine-tuned from the unsloth/Qwen3-1.7B-Base architecture. It was trained using a combination of Unsloth and Hugging Face's TRL library, which facilitated a 2x speedup in the training process.
Key Characteristics
- Base Model: Fine-tuned from unsloth/Qwen3-1.7B-Base.
- Training Efficiency: Leverages Unsloth and Hugging Face's TRL library for accelerated training.
- License: Distributed under the Apache-2.0 license.
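The README names the training stack (Unsloth plus TRL) but not the actual configuration. A minimal, hypothetical sketch of how such a fine-tune is typically set up with Unsloth's `FastLanguageModel` follows; the sequence length, 4-bit loading, LoRA rank, and target modules are illustrative assumptions, not values from this model card:

```python
def load_base_for_finetuning():
    """Load the base checkpoint with Unsloth for accelerated fine-tuning.

    Hypothetical setup: max_seq_length, 4-bit loading, and all LoRA
    settings below are illustrative assumptions, not from the card.
    """
    # Lazy import: unsloth is a heavy, GPU-oriented dependency.
    from unsloth import FastLanguageModel

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name="unsloth/Qwen3-1.7B-Base",  # base model named in the card
        max_seq_length=2048,                   # assumption
        load_in_4bit=True,                     # assumption
    )
    # Attach LoRA adapters; rank and target modules are assumptions.
    model = FastLanguageModel.get_peft_model(
        model,
        r=16,
        lora_alpha=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    )
    return model, tokenizer
```

A TRL trainer (e.g. for GRPO, as the model name hints) would then wrap the returned model; the card gives no details of that step.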
Use Cases
The README does not detail a specific primary use case or capabilities beyond the training methodology, but the model's Qwen3 foundation suggests suitability for general language tasks, and its name points to a maths-oriented fine-tune. Developers interested in Qwen3-family models trained with speed-optimized tooling may find this model particularly relevant.
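For general language tasks, the checkpoint can be loaded like any other Hugging Face causal language model. A minimal sketch using the standard `transformers` API; the prompt and generation settings are illustrative:

```python
MODEL_ID = "HillPhelmuth/Qwen3-4B-GRPO-MathsFT"

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Download the checkpoint on first call and return a text completion."""
    # Lazy import so the heavy dependency loads only when actually used.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

A call such as `generate("What is 17 * 23?")` would exercise the maths-oriented fine-tune suggested by the model name.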