Hyeongwon/P2-split2_bs256_prob_Qwen3-4B-Base_0317-01
Text generation · Concurrency cost: 1 · Model size: 4B · Quantization: BF16 · Context length: 32k · Published: Mar 17, 2026 · Architecture: Transformer

Hyeongwon/P2-split2_bs256_prob_Qwen3-4B-Base_0317-01 is a 4-billion-parameter language model fine-tuned from the Hyeongwon/Qwen3-4B-Base model. It was trained with Supervised Fine-Tuning (SFT) using the TRL framework and is intended for general text generation. Its 32768-token context length lets it handle long inputs and produce coherent, extended responses.
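As a quick orientation, the model can be loaded like any causal language model on the Hub. The sketch below uses the Hugging Face `transformers` API; the model ID, BF16 dtype, and 32k context come from this card, while the generation settings are illustrative assumptions, not values the card specifies.

```python
"""Minimal usage sketch for Hyeongwon/P2-split2_bs256_prob_Qwen3-4B-Base_0317-01.

Assumes `transformers` and `torch` are installed; generation parameters
are illustrative defaults, not taken from the model card.
"""

MODEL_ID = "Hyeongwon/P2-split2_bs256_prob_Qwen3-4B-Base_0317-01"
MAX_CONTEXT = 32_768  # context length stated on the card


def generate(prompt: str, max_new_tokens: int = 128) -> str:
    # Imports kept local so the module can be read without the heavy
    # dependencies installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # bfloat16 matches the BF16 quantization listed on the card.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Note that downloading the 4B BF16 weights requires roughly 8 GB of disk and a GPU (or ample RAM) to run; `device_map="auto"` lets `accelerate` place the weights on the available hardware.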
