xw1234gan/Extended_GRPO_KL_Qwen2.5-3B-Instruct_MATH_beta0_lr1e-05_mb2_ga128_n2048_seed42
Text generation · Concurrency cost: 1 · Model size: 3.1B · Quant: BF16 · Context length: 32k · Published: Apr 1, 2026 · Architecture: Transformer

xw1234gan/Extended_GRPO_KL_Qwen2.5-3B-Instruct_MATH_beta0_lr1e-05_mb2_ga128_n2048_seed42 is a 3.1-billion-parameter instruction-tuned model based on the Qwen2.5 architecture. As the repository name indicates, it is a GRPO (with KL term) fine-tune of Qwen2.5-3B-Instruct targeted at mathematical reasoning and problem solving, with the training dataset ("MATH") and key hyperparameters encoded in the name. Its 32768-token context window makes it suitable for complex mathematical queries that require extensive context, and its primary strength is this specialized optimization for numerical and logical tasks.
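Read literally, the suffix of the repository name appears to encode the fine-tuning configuration. The interpretation below is inferred from the name alone and is not confirmed by the card; in particular, the `mb`, `ga`, and `n` expansions are assumptions:

```yaml
# Hyperparameters apparently encoded in the model name (interpretation only):
algorithm: GRPO            # "Extended_GRPO_KL": GRPO with a KL term
kl_coefficient: 0          # "beta0"
learning_rate: 1e-05       # "lr1e-05"
micro_batch_size: 2        # "mb2" (assumed meaning)
grad_accumulation: 128     # "ga128" (assumed meaning)
n: 2048                    # "n2048": meaning unclear (samples or max length?)
seed: 42                   # "seed42"
```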


Model Overview

This model is a 3.1-billion-parameter language model based on the Qwen2.5 architecture, instruction-tuned to strengthen mathematical reasoning and problem solving. Its 32768-token context window lets it take in lengthy, intricate problem statements and supporting material in a single prompt.
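A minimal loading sketch, assuming the checkpoint is published on the Hugging Face Hub under the repo id below and that `torch`, `transformers`, and `accelerate` are installed:

```python
# Minimal sketch: load the checkpoint in BF16 (the precision listed above).
MODEL_ID = (
    "xw1234gan/Extended_GRPO_KL_Qwen2.5-3B-Instruct_MATH"
    "_beta0_lr1e-05_mb2_ga128_n2048_seed42"
)

def load(model_id: str = MODEL_ID):
    # Heavy imports are done lazily so this snippet can be read without
    # the dependencies installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches the BF16 quant above
        device_map="auto",           # requires `accelerate`
    )
    return tokenizer, model
```

The 3.1B parameter count keeps BF16 weights around 6-7 GB, so the model fits comfortably on a single consumer GPU.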

Key Capabilities

  • Mathematical Reasoning: Optimized for numerical and logical tasks.
  • Instruction Following: Designed to follow instructions accurately on mathematical queries.
  • Extended Context: A 32768-token context window accommodates long, multi-step problems.
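The capabilities above are exercised through standard chat-style prompting. A sketch of building a math query in the chat-message format used by Qwen2.5-Instruct checkpoints; the system prompt text is an illustrative assumption, not part of the release:

```python
def build_math_messages(problem: str) -> list[dict]:
    """Build a chat-format message list for a math problem.

    The system prompt below is a hypothetical example, not the one (if any)
    used during this model's fine-tuning.
    """
    return [
        {
            "role": "system",
            "content": (
                "You are a careful math assistant. Reason step by step "
                "and give the final answer in \\boxed{}."
            ),
        },
        {"role": "user", "content": problem},
    ]

messages = build_math_messages("Find the remainder when 7^100 is divided by 5.")
```

At inference time, such a message list would typically be rendered with `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` before calling `model.generate`.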

Good For

  • Applications requiring strong mathematical problem-solving.
  • Educational tools for math assistance.
  • Research in mathematical reasoning with large contexts.