maheshrawat18/Qwen3-4B-2507-sft1

  • Task: Text generation
  • Model size: 4B parameters
  • Quantization: BF16
  • Context length: 32k
  • Concurrency cost: 1
  • Published: Apr 28, 2026
  • License: apache-2.0
  • Architecture: Transformer (open weights)

maheshrawat18/Qwen3-4B-2507-sft1 is a 4-billion-parameter Qwen3 model fine-tuned by maheshrawat18 from unsloth/Qwen3-4B-Thinking-2507. It was trained with Unsloth and Hugging Face's TRL library, enabling 2x faster training, and is intended for general language tasks.


Model Overview

maheshrawat18/Qwen3-4B-2507-sft1 is a 4-billion-parameter language model based on the Qwen3 architecture. It was fine-tuned by maheshrawat18, building on the unsloth/Qwen3-4B-Thinking-2507 base model.
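
A minimal way to load and query the model with the Transformers library is sketched below. The model ID and BF16 dtype come from this card; the prompt and generation settings are illustrative assumptions, not recommendations from the author.

```python
# Minimal inference sketch (assumes `transformers` and `torch` are installed
# and a GPU is available; generation settings are illustrative only).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "maheshrawat18/Qwen3-4B-2507-sft1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # BF16, matching the quantization listed above
    device_map="auto",
)

messages = [{"role": "user", "content": "Briefly explain supervised fine-tuning."}]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```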

Key Characteristics

  • Architecture: Qwen3
  • Parameter Count: 4 billion parameters
  • Training Efficiency: The model was fine-tuned with a focus on speed, using Unsloth and Hugging Face's TRL library to achieve 2x faster training than standard methods (a sketch of this setup follows the list).
  • License: Distributed under the Apache-2.0 license.
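
As a rough illustration of that training setup, the sketch below wires Unsloth's FastLanguageModel into TRL's SFTTrainer. The LoRA hyperparameters, placeholder dataset, and training arguments are assumptions for illustration; the card does not publish the author's actual recipe.

```python
# Hedged SFT sketch with Unsloth + TRL (assumes `unsloth`, `trl`, and
# `datasets` are installed). All hyperparameters here are illustrative.
from unsloth import FastLanguageModel
from datasets import Dataset
from trl import SFTConfig, SFTTrainer

# Base model named on this card.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-4B-Thinking-2507",
    max_seq_length=2048,
    load_in_4bit=True,  # assumption: QLoRA-style memory savings during training
)

# Attach LoRA adapters; rank/targets are common Unsloth defaults, not confirmed.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Placeholder dataset: a real run would use the author's (unpublished) SFT data.
dataset = Dataset.from_dict({
    "text": ["### Instruction:\nSay hello.\n\n### Response:\nHello!"] * 16,
})

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,  # newer TRL versions call this `processing_class`
    train_dataset=dataset,
    args=SFTConfig(
        dataset_text_field="text",
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        max_steps=60,
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
```

Unsloth attributes its speedups mainly to hand-written fused kernels and a memory-efficient LoRA path, which is where the "2x faster training" figure comes from.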

Potential Use Cases

This model is suitable for general language generation and understanding tasks where a 4-billion-parameter model offers a reasonable balance between quality and computational cost. Its efficient fine-tuning process also makes it a plausible candidate for applications that require rapid iteration or deployment in resource-constrained environments.
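
For the resource-constrained case, one option is loading the checkpoint in 4-bit via bitsandbytes, sketched below. The quantization settings are assumptions for illustration; the card itself only lists BF16 weights.

```python
# 4-bit loading sketch for low-memory deployment (assumes `transformers`,
# `torch`, and `bitsandbytes` are installed). Settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "maheshrawat18/Qwen3-4B-2507-sft1"

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # NF4 quantization for the weights
    bnb_4bit_compute_dtype=torch.bfloat16,  # compute in BF16, as listed above
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
# At 4-bit, a 4B-parameter model needs roughly 2-3 GB for weights, versus
# ~8 GB in BF16 (rough estimates, excluding activations and KV cache).
```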