BRlkl/distill-sft-qwen3-4b-full
Text generation · Concurrency cost: 1 · Model size: 4B · Quantization: BF16 · Context length: 32k · Published: Mar 27, 2026 · Architecture: Transformer

BRlkl/distill-sft-qwen3-4b-full is a 4-billion-parameter instruction-tuned causal language model, fine-tuned by BRlkl from unsloth/Qwen3-4B-Instruct-2507 using supervised fine-tuning (SFT) with the TRL library. It is designed for general text-generation tasks and supports a 32,768-token (32k) context length.
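Since this is a standard instruction-tuned causal LM, it can presumably be loaded with the Hugging Face `transformers` chat API. The sketch below shows one plausible way to run it; the generation settings and the lazy-loading helper are illustrative assumptions, not defaults documented for this model.

```python
# Usage sketch for BRlkl/distill-sft-qwen3-4b-full, assuming the standard
# Hugging Face `transformers` chat workflow. Settings here are illustrative.

MODEL_ID = "BRlkl/distill-sft-qwen3-4b-full"
MAX_CONTEXT = 32_768  # context length stated on the card (32k tokens)


def chat(messages, max_new_tokens=256):
    """Load the model lazily and generate a reply for a chat transcript."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="bfloat16",  # BF16, matching the quantization on the card
        device_map="auto",
    )
    # Apply the model's chat template and append the assistant turn marker.
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)


# Example call (downloads the 4B checkpoint on first use):
# reply = chat([{"role": "user", "content": "Explain SFT in one sentence."}])
```

Imports are deferred into the helper so the module can be inspected without pulling in `torch`/`transformers` or downloading the checkpoint.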
