amphora/qwen3-4b-plz
The amphora/qwen3-4b-plz model is a 4-billion-parameter language model fine-tuned from Qwen/Qwen3-4B-Thinking-2507. It is optimized for reasoning tasks, having been trained on the combined_reasoning_sft_lt100k dataset, and targets applications that require enhanced logical inference and problem solving within a 32K-token context window.
Model Overview
amphora/qwen3-4b-plz is derived from the Qwen3-4B-Thinking-2507 base model and further fine-tuned on the combined_reasoning_sft_lt100k dataset, specializing it for reasoning-focused tasks.
Key Characteristics
- Base Model: Qwen/Qwen3-4B-Thinking-2507.
- Parameter Count: 4 billion parameters.
- Context Length: Supports a context window of 32,768 tokens.
- Specialization: Fine-tuned on the combined_reasoning_sft_lt100k dataset for improved performance on reasoning tasks.
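These characteristics can be checked locally from the checkpoint's configuration. The sketch below assumes the standard transformers auto classes resolve this checkpoint the same way they do the Qwen3 base model:

```python
from transformers import AutoConfig

# Inspect the checkpoint's configuration without downloading the full weights.
config = AutoConfig.from_pretrained("amphora/qwen3-4b-plz")

print(config.model_type)               # expected: "qwen3"
print(config.max_position_embeddings)  # expected: 32768 (the 32K context window)
```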
Training Details
The model was trained for 3 epochs across 8 devices with a learning rate of 4e-05 and an effective batch size of 128 (per-device train batch size of 2 × gradient accumulation steps of 8 × 8 devices). Optimization used AdamW with a cosine learning-rate schedule and a warmup ratio of 0.1.
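For reference, here is how these reported hyperparameters would map onto a Hugging Face TrainingArguments object. This is a sketch only: the card does not state which training framework was used, and the output path is a hypothetical placeholder.

```python
from transformers import TrainingArguments

# Hyperparameters as reported on the model card. Effective batch size:
# 2 (per device) x 8 (gradient accumulation) x 8 (devices) = 128.
training_args = TrainingArguments(
    output_dir="qwen3-4b-plz-sft",  # hypothetical output path
    learning_rate=4e-5,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    num_train_epochs=3,
    optim="adamw_torch",            # AdamW optimizer
    lr_scheduler_type="cosine",     # cosine learning-rate schedule
    warmup_ratio=0.1,
)
```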
Intended Use Cases
This model is suited to applications where robust reasoning and logical inference are critical. Its fine-tuning on a reasoning-specific dataset targets understanding and generating responses that require analytical, step-by-step thought.
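As a usage illustration, the sketch below runs a simple logical-inference prompt through the model. It assumes the checkpoint inherits the Qwen3-Thinking chat template, in which the model emits its reasoning trace before a closing `</think>` tag; the prompt and generation settings are illustrative, not prescribed by the card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "amphora/qwen3-4b-plz"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

# A small logical-inference prompt of the kind the fine-tuning targets.
messages = [
    {
        "role": "user",
        "content": "If all bloops are razzies and all razzies are lazzies, "
                   "are all bloops lazzies?",
    }
]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=1024)

# Decode only the newly generated tokens; for Thinking-style templates the
# reasoning trace precedes a closing </think> tag, followed by the answer.
response = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(response)
```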