maheshrawat18/Qwen3-4B-2507-sft-new

Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Apr 30, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

maheshrawat18/Qwen3-4B-2507-sft-new is a 4-billion-parameter Qwen3-based language model developed by maheshrawat18. It was fine-tuned with Unsloth and Hugging Face's TRL library, a combination the authors report enables 2x faster training. The model is intended for general language tasks, pairing the Qwen3 architecture with an efficient fine-tuning process.


Model Overview

maheshrawat18/Qwen3-4B-2507-sft-new is a 4-billion-parameter language model based on the Qwen3 architecture. Developed by maheshrawat18, it was fine-tuned from unsloth/Qwen3-4B-Thinking-2507.

Key Characteristics

  • Efficient Fine-tuning: The model was fine-tuned using the Unsloth library in conjunction with Hugging Face's TRL library, which the authors report made training 2x faster. The speedup comes from an optimized training pipeline rather than any change to the model itself.
  • Qwen3 Architecture: Built upon the Qwen3 foundation, it inherits the general capabilities and performance characteristics of this model family.
  • License: The model is released under the Apache-2.0 license, allowing for broad usage and distribution.
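The card does not publish the training script, but the Unsloth + TRL workflow it names typically looks like the sketch below. The dataset, LoRA rank, and hyperparameters are illustrative assumptions, not details taken from this card; only the base checkpoint name and the 32k context length come from the card itself.

```python
# Hedged sketch of an Unsloth + TRL supervised fine-tuning run.
# Dataset choice and all hyperparameters are placeholders.

BASE_MODEL = "unsloth/Qwen3-4B-Thinking-2507"  # base checkpoint named in the card
MAX_SEQ_LENGTH = 32_768                        # matches the 32k context length


def finetune():
    # Imports are local so the file can be read without a GPU environment;
    # running this requires unsloth, trl, and datasets to be installed.
    from unsloth import FastLanguageModel
    from trl import SFTTrainer, SFTConfig
    from datasets import load_dataset

    # Unsloth patches the model for faster training / lower memory use.
    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=BASE_MODEL,
        max_seq_length=MAX_SEQ_LENGTH,
        load_in_4bit=True,  # optional memory-saving quantized load
    )
    # Attach LoRA adapters; rank 16 is a common default, not from the card.
    model = FastLanguageModel.get_peft_model(model, r=16)

    # Placeholder instruction dataset, assumed for illustration only.
    dataset = load_dataset("yahma/alpaca-cleaned", split="train")

    trainer = SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=dataset,
        args=SFTConfig(
            per_device_train_batch_size=2,
            gradient_accumulation_steps=4,
            max_steps=100,
            output_dir="outputs",
        ),
    )
    trainer.train()
    model.save_pretrained("Qwen3-4B-2507-sft-new")


# finetune()  # uncomment on a machine with a suitable GPU
```

Unsloth's speedup comes from fused kernels and memory-efficient attention, which is consistent with the "2x faster training" claim above; the exact settings used for this checkpoint are not documented.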

Potential Use Cases

Given its efficient fine-tuning and Qwen3 base, this model is suitable for:

  • General text generation and understanding tasks.
  • Applications requiring a moderately sized language model (4B parameters) with optimized training.
  • Experimentation with Qwen3 models where faster fine-tuning is a priority.
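For the text-generation use cases above, a minimal inference sketch follows, assuming the model exposes the standard Hugging Face transformers causal-LM interface with a chat template (the prompt is illustrative; the repo id comes from this card).

```python
# Minimal inference sketch for maheshrawat18/Qwen3-4B-2507-sft-new,
# assuming the standard transformers causal-LM + chat-template interface.

MODEL_ID = "maheshrawat18/Qwen3-4B-2507-sft-new"


def generate(prompt: str, max_new_tokens: int = 256) -> str:
    # Imports are local so the file can be read without transformers/torch;
    # loading the 4B model needs a GPU or ample RAM.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )

    messages = [{"role": "user", "content": prompt}]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)


# Example (downloads the weights on first use):
# print(generate("Explain supervised fine-tuning in one sentence."))
```

Since the base checkpoint is a "Thinking" variant, outputs may include reasoning traces before the final answer; handle or strip those as your application requires.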