AlisonWenNCTU/sft-qwen2.5-7b-generate-thinking-no-guideline
Text generation · Model size: 7.6B · Quantization: FP8 · Context length: 32k · Concurrency cost: 1 · Architecture: Transformer · Published: Jan 26, 2026

AlisonWenNCTU/sft-qwen2.5-7b-generate-thinking-no-guideline is a 7.6-billion-parameter language model based on the Qwen2.5-7B architecture. It was fine-tuned for 3 epochs on the nvidia/Nemotron-Cascade-SFT-Stage-2 instruction-following dataset, reaching a final training loss of 0.07696. The model is intended for general instruction-following tasks, combining its base architecture with this specialized SFT training.
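A minimal usage sketch with Hugging Face `transformers`, assuming the checkpoint follows the standard Qwen2.5 chat format (the prompt and generation settings below are illustrative, not part of the model card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "AlisonWenNCTU/sft-qwen2.5-7b-generate-thinking-no-guideline"

# Load the tokenizer and model; device_map="auto" places weights on available GPUs.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype="auto", device_map="auto"
)

# Build a chat-formatted prompt (example question is hypothetical).
messages = [{"role": "user", "content": "Explain gradient descent in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate and decode only the newly produced tokens.
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Since the page lists an FP8 quantization and a 32k context, serving stacks such as vLLM can also load the checkpoint directly by its repository ID.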
