DADA121/qwen2.5-0.5b-math-sft-new
Text Generation · Concurrency Cost: 1 · Model Size: 0.5B · Quant: BF16 · Ctx Length: 32k · Published: Apr 11, 2026 · Architecture: Transformer

DADA121/qwen2.5-0.5b-math-sft-new is a 0.5-billion-parameter language model based on the Qwen2.5 architecture, fine-tuned for mathematical tasks. With a 32768-token context length, it is designed for specialized applications that require numerical reasoning and problem solving, and it aims to outperform general-purpose language models of similar size in mathematical domains.


Model Overview

DADA121/qwen2.5-0.5b-math-sft-new is a compact 0.5-billion-parameter language model built on the Qwen2.5 architecture. It has undergone supervised fine-tuning (SFT) specifically to strengthen its mathematical reasoning and problem-solving abilities, and its substantial 32768-token context window lets it process long mathematical problems together with related textual information.
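If the checkpoint is hosted on the Hugging Face Hub under this ID with the standard Qwen2.5 file layout, it should load through the transformers library. The sketch below is an illustration under those assumptions rather than official usage: the BF16 dtype mirrors the quantization listed above, and device_map="auto" presumes the optional accelerate package is installed.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical usage sketch; assumes the repo follows the standard
# Qwen2.5 layout on the Hugging Face Hub.
model_id = "DADA121/qwen2.5-0.5b-math-sft-new"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="bfloat16",  # matches the BF16 quantization listed above
    device_map="auto",       # requires the accelerate package; omit for CPU
)
```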

Key Capabilities

  • Mathematical Task Specialization: Fine-tuned to excel in mathematical contexts, suggesting improved accuracy and understanding for numerical problems (see the generation sketch after this list).
  • Extended Context Window: Supports a 32768-token context length, beneficial for complex multi-step mathematical problems or data-rich inputs.
  • Compact Size: At 0.5 billion parameters, it offers a smaller footprint for deployment while focusing on a specialized domain.
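
As a concrete illustration of the mathematical-query capability above, the snippet below continues from the loading sketch and sends a short word problem through the tokenizer's chat template. This assumes the fine-tune kept Qwen2.5's chat template; the example problem and max_new_tokens value are arbitrary choices, not documented defaults.

```python
# Continues from the loading sketch above (tokenizer, model in scope).
messages = [
    {"role": "user",
     "content": "A train travels 120 km in 1.5 hours. "
                "What is its average speed in km/h?"},
]

# Build the prompt with the tokenizer's chat template, assuming this
# fine-tune retained it from the Qwen2.5 base model.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)

# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:],
                       skip_special_tokens=True))
```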

Good For

  • Applications requiring a dedicated model for mathematical computations and reasoning.
  • Scenarios where a smaller, specialized model is preferred over larger, general-purpose LLMs for efficiency.
  • Tasks that involve processing and answering mathematical queries within a long context.