maheshrawat18/Qwen3-4B-2507-sft-merged-thinking-final

Text generation · Concurrency cost: 1 · Model size: 4B · Quantization: BF16 · Context length: 32k · Published: Apr 16, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

maheshrawat18/Qwen3-4B-2507-sft-merged-thinking-final is a 4-billion-parameter, Qwen3-based causal language model developed by maheshrawat18. It was fine-tuned from unsloth/Qwen3-4B-Thinking-2507 using Unsloth and Hugging Face's TRL library, enabling 2x faster training. The model supports a context length of 32768 tokens and is suited to general language generation tasks where efficient training and a substantial context window are beneficial.


Overview

maheshrawat18/Qwen3-4B-2507-sft-merged-thinking-final is a 4 billion parameter language model based on the Qwen3 architecture, developed by maheshrawat18. This model was fine-tuned from the unsloth/Qwen3-4B-Thinking-2507 base model.

Key Capabilities

  • Efficient Training: Leverages Unsloth and Hugging Face's TRL library, resulting in a 2x speedup during the training process.
  • Substantial Context Window: Supports a context length of 32768 tokens, allowing for processing and generating longer sequences of text.
  • Qwen3 Architecture: Benefits from the underlying Qwen3 architecture, providing a strong foundation for various natural language processing tasks.
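The Unsloth + TRL fine-tuning workflow mentioned above can be sketched roughly as follows. This is a minimal, hypothetical illustration, not the author's actual training script: the dataset name, LoRA rank, and trainer hyperparameters are placeholders, and the exact keyword arguments may differ across Unsloth and TRL versions.

```python
# Hypothetical sketch of an Unsloth + TRL supervised fine-tuning run.
# Dataset and hyperparameters are illustrative, not the author's recipe.

BASE_MODEL = "unsloth/Qwen3-4B-Thinking-2507"  # base model named in the card
MAX_SEQ_LENGTH = 32768                          # context length from the card


def sft_config_kwargs():
    """Illustrative trainer settings for a short SFT run (placeholders)."""
    return {
        "per_device_train_batch_size": 2,
        "gradient_accumulation_steps": 4,
        "learning_rate": 2e-4,
        "max_steps": 60,
        "output_dir": "outputs",
    }


if __name__ == "__main__":
    from unsloth import FastLanguageModel
    from trl import SFTConfig, SFTTrainer
    from datasets import load_dataset

    # Load the base model through Unsloth's patched loader.
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=BASE_MODEL,
        max_seq_length=MAX_SEQ_LENGTH,
        load_in_4bit=True,  # 4-bit loading to fit a single GPU (assumption)
    )

    # Attach LoRA adapters; rank and target modules are illustrative.
    model = FastLanguageModel.get_peft_model(
        model,
        r=16,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                        "gate_proj", "up_proj", "down_proj"],
    )

    # Placeholder dataset; the card does not name the actual SFT data.
    dataset = load_dataset("yahma/alpaca-cleaned", split="train")

    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=dataset,
        args=SFTConfig(**sft_config_kwargs()),
    )
    trainer.train()

    # Merge the LoRA adapters back into the base weights, matching the
    # "sft-merged" naming of this checkpoint.
    model.save_pretrained_merged("Qwen3-4B-2507-sft-merged", tokenizer)
```

Merging the adapters at the end produces a standalone checkpoint that loads with plain transformers, which is consistent with the "merged" suffix in the model name.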

Good for

  • Applications requiring a 4 billion parameter model with an extended context length.
  • Scenarios where efficient fine-tuning is a priority.
  • General text generation and understanding tasks building upon the Qwen3 family.
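For the generation tasks above, a minimal inference sketch with the Hugging Face transformers library might look like this (assuming transformers and torch are installed; the prompt and sampling settings are illustrative, and the checkpoint is loaded in BF16 as listed in the card's metadata):

```python
# Minimal generation sketch; prompt and generation settings are illustrative.

MODEL_ID = "maheshrawat18/Qwen3-4B-2507-sft-merged-thinking-final"


def build_messages(user_prompt):
    """Wrap a user prompt in the chat-message format expected by the
    Qwen3 chat template."""
    return [{"role": "user", "content": user_prompt}]


if __name__ == "__main__":
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # BF16 weights per the card metadata
        device_map="auto",
    )

    # Render the chat template, then generate a continuation.
    text = tokenizer.apply_chat_template(
        build_messages("Explain the difference between a list and a tuple in Python."),
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=512)

    # Decode only the newly generated tokens.
    print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:],
                           skip_special_tokens=True))
```

Because this is a "thinking" variant, generated output may include a reasoning segment before the final answer; how that segment is delimited depends on the Qwen3 chat template shipped with the tokenizer.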