edbeeching/Qwen3-4B-Base-SFT-tr5
Text Generation · Concurrency cost: 1 · Model size: 4B · Quantization: BF16 · Context length: 32k · Published: Mar 22, 2026 · Architecture: Transformer

edbeeching/Qwen3-4B-Base-SFT-tr5 is a 4-billion-parameter language model fine-tuned from Qwen/Qwen3-4B-Base using the TRL library. It was trained with Supervised Fine-Tuning (SFT) to improve its conversational and instruction-following capabilities. The model supports a 32,768-token context length, making it suitable for long prompts and coherent, extended responses. Its primary applications are general text generation and conversational AI.
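A minimal usage sketch with the `transformers` library is shown below. The ChatML-style prompt helper here is an assumption based on the Qwen family's template; in practice, `tokenizer.apply_chat_template` should be preferred, and the example prompt is purely illustrative.

```python
# Hedged usage sketch for edbeeching/Qwen3-4B-Base-SFT-tr5 (not an official example).

def build_chatml_prompt(messages):
    """Render a list of {role, content} dicts into ChatML-style text,
    ending with an open assistant header so the model continues from there.
    Assumes the model follows the Qwen-family ChatML template."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

if __name__ == "__main__":
    # Requires `pip install transformers torch` and downloads the model weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "edbeeching/Qwen3-4B-Base-SFT-tr5"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

    prompt = build_chatml_prompt(
        [{"role": "user", "content": "Summarize the history of the transistor."}]
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens.
    print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

With `tokenizer.apply_chat_template(messages, add_generation_prompt=True)`, the manual helper becomes unnecessary; it is included only to make the expected prompt format explicit.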
