cs-552-2026-middle-west/math_model

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:2BQuant:BF16Ctx Length:32kPublished:May 12, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The cs-552-2026-middle-west/math_model is a 2 billion parameter language model developed by cs-552-2026-middle-west, initialized from Qwen/Qwen3-1.7B. This model is specifically specialized and optimized for mathematical tasks, leveraging a math-focused system prompt and a chat template that encourages a "thinking mode." It is designed for accurate mathematical problem-solving, with final answers expected in a \boxed{...} format.

Loading preview...

Overview

The cs-552-2026-middle-west/math_model is a 2 billion parameter language model, fine-tuned from Qwen/Qwen3-1.7B, with its weights stored in safetensors format for vLLM compatibility. This model is explicitly specialized for mathematical problem-solving.

Key Capabilities & Features

  • Math Specialization: Optimized for mathematical tasks, utilizing a dedicated math-focused system prompt injected via its chat template.
  • Structured Output: Designed to place final answers within a \boxed{...} format, facilitating automated evaluation and clear result presentation.
  • Thinking Mode: The chat template incorporates a "thinking mode" to guide the model's reasoning process during problem-solving.
  • Evaluation Focus: Primarily intended for evaluation on math benchmarks, as part of a course CI, using specific generation parameters (temperature 0.6, top-p 0.95, top-k 20, max new tokens 3584).

Good For

  • Mathematical Problem Solving: Ideal for applications requiring accurate and structured responses to mathematical queries.
  • Educational Tools: Suitable for developing tools that assist in learning or verifying mathematical solutions.
  • Research in Math LLMs: Provides a specialized base for further research and development in language models focused on quantitative reasoning.