BRlkl/distill-sft-qwen3-0.6b-full
Text Generation · Concurrency Cost: 1 · Model Size: 0.8B · Quant: BF16 · Context Length: 32k · Published: Mar 27, 2026 · Architecture: Transformer
BRlkl/distill-sft-qwen3-0.6b-full is a 0.8-billion-parameter language model fine-tuned from unsloth/Qwen3-0.6B using supervised fine-tuning (SFT) with the TRL framework. It targets general text generation, and its compact size and 32768-token context length make it well suited to efficient deployment. It can serve as a base for natural language processing applications that call for a small, fine-tuned model.
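Below is a minimal usage sketch with the Hugging Face transformers library. It assumes the checkpoint is published on the Hub under the ID above and follows the standard Qwen3 chat template; the prompt and generation settings are illustrative only.

```python
# Minimal sketch: load the model and run a single chat-style generation.
# Assumes Hub availability of this checkpoint and transformers compatibility.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BRlkl/distill-sft-qwen3-0.6b-full"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Load in bfloat16 to match the BF16 quantization listed above.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16")

messages = [{"role": "user", "content": "Summarize what supervised fine-tuning does."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```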