yusei926/qwen3-4b-sft-merged-v2-20260207-1148

Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Feb 7, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights · Warm

yusei926/qwen3-4b-sft-merged-v2-20260207-1148 is a 4-billion-parameter language model derived from the Qwen3 architecture and fine-tuned for instruction following. It is a 16-bit merged model based on unsloth/Qwen3-4B-Instruct-2507, produced with Supervised Fine-Tuning (SFT) at a LoRA rank of 64. It targets general instruction-based tasks and supports a 40,960-token context length for long-form understanding and generation.


Model Overview

yusei926/qwen3-4b-sft-merged-v2-20260207-1148 is a 4-billion-parameter language model built on the Qwen3 architecture. This release is a 16-bit merged model derived from unsloth/Qwen3-4B-Instruct-2507 and adapted via Supervised Fine-Tuning (SFT).

Key Capabilities

  • Instruction Following: The model is fine-tuned primarily for instruction-based tasks, making it suitable for a wide range of conversational and command-driven applications (see the loading sketch after this list).
  • Context Handling: With a context length of 40,960 tokens, it can process and generate long sequences of text, supporting complex interactions and detailed content creation.
  • Efficiency: At 4B parameters, it balances capability and computational cost, allowing more accessible deployment than larger models.
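
A minimal loading sketch using the standard Hugging Face transformers API is shown below; the repo id and BF16 dtype come from this card, while the prompt and generation settings are illustrative placeholders.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "yusei926/qwen3-4b-sft-merged-v2-20260207-1148"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card lists BF16 weights
    device_map="auto",
)

# Instruction-style usage via the tokenizer's chat template.
messages = [{"role": "user", "content": "Summarize the benefits of LoRA fine-tuning."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Because the LoRA weights are already merged into the base model, no separate adapter-loading step is needed at inference time.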

Training Details

The model was trained with Supervised Fine-Tuning (SFT) using the following parameters:

  • Learning Rate (LR): 5e-05
  • Epochs: 2
  • LoRA Configuration: Fine-tuning used a LoRA rank of 64, a parameter-efficient adaptation method that trains low-rank update matrices instead of the full weights.

Notably, Direct Preference Optimization (DPO) was disabled during training; the model relies solely on the SFT phase for instruction alignment.
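
For readers who want to reproduce a comparable setup, the sketch below reconstructs the documented hyperparameters with trl and peft. Only the LoRA rank (64), learning rate (5e-05), and epoch count (2) come from this card; the dataset path, lora_alpha, target modules, and output directory are hypothetical placeholders.

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: the actual SFT data is not documented on the card.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

peft_config = LoraConfig(
    r=64,                        # LoRA rank reported on the card
    lora_alpha=64,               # assumption: alpha is not documented
    target_modules="all-linear", # assumption: target modules are not documented
    task_type="CAUSAL_LM",
)

trainer = SFTTrainer(
    model="unsloth/Qwen3-4B-Instruct-2507",  # base model named on the card
    train_dataset=dataset,
    peft_config=peft_config,
    args=SFTConfig(
        learning_rate=5e-5,    # from the card
        num_train_epochs=2,    # from the card
        output_dir="qwen3-4b-sft",
    ),
)
trainer.train()
```

Producing the 16-bit merged checkpoint that gives this model its name would then be a matter of folding the trained LoRA adapter back into the base weights, e.g. with peft's merge_and_unload().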