ertghiu256/Qwen3-4B-Thinking-2507-Hermes-3
Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Aug 31, 2025 · License: apache-2.0 · Architecture: Transformer

ertghiu256/Qwen3-4B-Thinking-2507-Hermes-3 is a 4-billion-parameter Qwen3-based language model fine-tuned on the Hermes 3 dataset. It is designed to retain the base model's strong reasoning capabilities while improving instruction following. With a context length of 40,960 tokens, it is well suited to tasks that require extensive contextual understanding and precise responses.


Model Overview

ertghiu256/Qwen3-4B-Thinking-2507-Hermes-3 is a 4-billion-parameter model built on the Qwen3 architecture. It has been fine-tuned on the Hermes 3 dataset, with a focus on strengthening its core capabilities in reasoning and instruction adherence. The model supports an extended context length of 40,960 tokens, allowing it to process and generate longer, more complex texts.

Key Capabilities

  • Enhanced Reasoning: The model is specifically trained to maintain and improve its reasoning abilities.
  • Better Instruction Following: It demonstrates improved performance in understanding and executing user instructions.
  • Extended Context: With a 40,960-token context window, it can handle detailed and lengthy prompts.

Training Details

The model was trained using Unsloth for 60 steps at a learning rate of 3e-5, on 28,000 samples drawn from the Hermes 3 dataset. This training mix accounts for its specialized focus on reasoning and instruction following.
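For reference, the stated hyperparameters can be collected into a single training configuration. The sketch below is illustrative only: the field names, the base-model path, and the dataset identifier are assumptions, not details taken from the card.

```yaml
# Hypothetical fine-tuning config mirroring the card's stated values.
# base_model and dataset paths are assumed, not confirmed by the card.
base_model: Qwen/Qwen3-4B-Thinking-2507
dataset: Hermes-3            # 28,000 sampled rows, per the card
num_samples: 28000
max_steps: 60
learning_rate: 3.0e-5
framework: unsloth
```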

Recommended Usage

For best results, the following inference parameters are recommended:

  • Temperature: 0.6
  • Top_P: 0.95
  • Top_K: 20
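To make these settings concrete, here is a minimal, self-contained sketch of how temperature, top-k, and top-p (nucleus) filtering interact when sampling a next token from raw logits. This is plain Python for illustration; in practice you would pass `temperature`, `top_p`, and `top_k` to your inference engine's generation config rather than reimplement sampling.

```python
import math
import random

def sample_next_token(logits, temperature=0.6, top_p=0.95, top_k=20, rng=None):
    """Sample a token id from raw logits using the card's recommended settings."""
    rng = rng or random.Random(0)
    # 1. Temperature: scale logits; lower values sharpen the distribution.
    scaled = [l / temperature for l in logits]
    # 2. Softmax (subtract max for numerical stability).
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = [(i, e / total) for i, e in enumerate(exps)]
    # 3. Top-k: keep only the k most likely tokens.
    probs.sort(key=lambda pair: pair[1], reverse=True)
    probs = probs[:top_k]
    # 4. Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    kept, mass = [], 0.0
    for i, p in probs:
        kept.append((i, p))
        mass += p
        if mass >= top_p:
            break
    # 5. Renormalize the surviving tokens and draw one.
    norm = sum(p for _, p in kept)
    r = rng.random() * norm
    acc = 0.0
    for i, p in kept:
        acc += p
        if r <= acc:
            return i
    return kept[-1][0]
```

With a strongly peaked distribution, top-p filtering keeps only the dominant token, so sampling behaves nearly greedily; flatter distributions leave more candidates in play.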