sulthankris/WAIANG-Qwen3-4B

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Dec 12, 2025License:apache-2.0Architecture:Transformer Open Weights Warm

The sulthankris/WAIANG-Qwen3-4B is a 4 billion parameter Qwen3 model, developed by sulthankris, fine-tuned using Unsloth and Huggingface's TRL library. It was trained on a synthetic dataset, emphasizing efficient training. This model is suitable for applications requiring a compact yet capable language model with a 40960 token context length.

Loading preview...

Model Overview

The sulthankris/WAIANG-Qwen3-4B is a 4 billion parameter language model based on the Qwen3 architecture, developed by sulthankris. It was fine-tuned from unsloth/qwen3-4b-bnb-4bit using the Unsloth library, which enabled 2x faster training, and Huggingface's TRL library.

Key Characteristics

  • Architecture: Qwen3-4B
  • Parameter Count: 4 billion
  • Context Length: 40960 tokens
  • Training: Fine-tuned on a 21.5 MB synthetic dataset, primarily generated from tngtech/DeepSeek-TNG-R1T2-Chimera.
  • Efficiency: Leverages Unsloth for accelerated training.

Potential Use Cases

This model is well-suited for applications where a compact and efficiently trained language model is beneficial. Its fine-tuning on a synthetic dataset suggests potential for tasks aligned with the data's characteristics, offering a balance between performance and resource efficiency.