jerrycheng233/model6_gspo_qwen3_16bit
TEXT GENERATIONConcurrency Cost:1Model Size:14BQuant:FP8Ctx Length:32kPublished:Mar 27, 2026License:apache-2.0Architecture:Transformer Open Weights Cold
The jerrycheng233/model6_gspo_qwen3_16bit is a 14 billion parameter Qwen3-based causal language model developed by jerrycheng233. It was fine-tuned from TeichAI/Qwen3-14B-Claude-4.5-Opus-High-Reasoning-Distill using Unsloth and Huggingface's TRL library, resulting in 2x faster training. This model is optimized for high-reasoning tasks, leveraging its foundation for advanced cognitive capabilities.
Loading preview...