wangsherpa/qwen2.5-0.5B-math-cot-sft
- Task: Text Generation
- Concurrency Cost: 1
- Model Size: 0.5B
- Quant: BF16
- Ctx Length: 32k
- Published: Mar 17, 2026
- Architecture: Transformer
- Status: Warm

The wangsherpa/qwen2.5-0.5B-math-cot-sft model is a 0.5-billion-parameter language model based on the Qwen2.5 architecture. Its name indicates supervised fine-tuning (SFT) on math chain-of-thought (CoT) data, suggesting an optimization for step-by-step mathematical reasoning rather than general-purpose use. With a context length of 32,768 tokens, it is intended for developers who need a compact model for focused analytical or numerical problem-solving.
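The card does not include a usage snippet. A minimal sketch of how such a model could be queried through the Hugging Face `transformers` API might look as follows; the system-prompt wording and the `generate_answer` helper are assumptions for illustration, since the card does not specify a required prompt format:

```python
def build_cot_prompt(question: str) -> list[dict]:
    """Build a chat-style message list nudging the model toward
    step-by-step reasoning. The system message wording is an
    assumption; the model card specifies no prompt format."""
    return [
        {
            "role": "system",
            "content": "You are a helpful math assistant. "
                       "Reason step by step before giving a final answer.",
        },
        {"role": "user", "content": question},
    ]


def generate_answer(question: str, max_new_tokens: int = 512) -> str:
    """Hypothetical helper: load the model via transformers and run
    generation. Requires `transformers` and `torch`; downloads the
    model weights on first use."""
    # Deferred import so the prompt helper above stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "wangsherpa/qwen2.5-0.5B-math-cot-sft"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16")

    # apply_chat_template renders the messages into the model's chat format.
    text = tokenizer.apply_chat_template(
        build_cot_prompt(question), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the prompt.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )


if __name__ == "__main__":
    print(generate_answer("If 3x + 7 = 22, what is x?"))
```

At 0.5B parameters in BF16, the weights fit comfortably on CPU or a small GPU, which is the main appeal of a compact specialized model like this one.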
