cs-552-2026-thinkinsidethebox/math_model

Text Generation | Concurrency Cost: 1 | Model Size: 2B | Quant: BF16 | Ctx Length: 32k | Tool Calling: Supported | Published: May 7, 2026 | Architecture: Transformer | Cold

cs-552-2026-thinkinsidethebox/math_model is a language model developed by cs-552-2026-thinkinsidethebox and fine-tuned for mathematical reasoning and problem-solving. On MATH-500 it reaches a pass@8 of 0.94. It is optimized for generating mathematical solutions that end in a boxed final answer, which makes the answer straightforward to extract and check.

Overview

The math_model is the final selected model from the cs-552-2026-thinkinsidethebox project and is built for mathematical reasoning. Its base weights come from exports/sft_omr_asy_diagram_repair_from_v2best_lr1e5_r64_s100_no_think_thinking_eval. The export name suggests a supervised fine-tuning run involving diagram repair and a "no think" setting, but the card does not document that training procedure.

Key Capabilities & Performance

Results on the local MATH-500 benchmark, with n=8 samples per problem and max_new_tokens=16384:

  • pass@8: 0.94
  • pass@1: 0.79025
  • Boxed Extraction Rate: 0.941, the fraction of responses from which a boxed final answer could be extracted.
  • Accuracy Given Extraction: 0.839798087141339 (about 0.84), the fraction of extracted answers that are correct.
  • Average Response Token Length: 4186.863, well below the 16384-token generation limit.
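
The reported numbers can be cross-checked against each other. The sketch below assumes that MATH-500 contains 500 problems, that pass@1 is the mean per-sample accuracy over all 500 × 8 = 4000 responses, and that a response with no extractable boxed answer counts as incorrect. None of these definitions is stated in the card; they are assumptions under which pass@1 should equal the extraction rate multiplied by the accuracy given extraction.

```python
# Consistency check on the reported MATH-500 metrics.
# Assumptions (not stated in the card): 500 problems, pass@1 averaged over
# every sampled response, unextractable responses scored as incorrect.
NUM_PROBLEMS = 500
SAMPLES_PER_PROBLEM = 8  # n=8 from the evaluation setup

pass_at_1 = 0.79025
boxed_extraction_rate = 0.941
accuracy_given_extraction = 0.839798087141339

total_responses = NUM_PROBLEMS * SAMPLES_PER_PROBLEM

# Implied raw counts behind the reported rates.
correct_responses = round(pass_at_1 * total_responses)
extracted_responses = round(boxed_extraction_rate * total_responses)

# Accuracy among responses that yielded a boxed answer.
implied_accuracy = correct_responses / extracted_responses

print(f"correct responses:   {correct_responses} / {total_responses}")
print(f"extracted responses: {extracted_responses} / {total_responses}")
print(f"implied accuracy given extraction: {implied_accuracy:.15f}")
```

Under these assumptions the counts come out to 3161 correct and 3764 extracted responses out of 4000, and their ratio matches the reported accuracy given extraction, so the three figures are mutually consistent.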

Recommended Usage

For optimal performance, the following decoding parameters are recommended:

  • temperature: 0.55
  • top_p: 0.95
  • repetition_penalty: 1.03
  • max_new_tokens: 16384, the same limit used in the MATH-500 evaluation above, leaving room for long derivations.
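
As a minimal sketch, the settings above can be collected into one dictionary and paired with a boxed-answer extractor. The parameter names match the keyword arguments of `generate` in Hugging Face `transformers`, but the card does not say which runtime the model is served with, so loading it that way is an assumption. `do_sample=True` is also an assumption, added because `temperature` and `top_p` only take effect when sampling. `extract_boxed` and `solve` are hypothetical helpers written for illustration, not the project's own extraction code.

```python
# Decoding parameters copied from the recommendations above.
GENERATION_KWARGS = {
    "do_sample": True,  # assumption: sampling enabled so temperature/top_p apply
    "temperature": 0.55,
    "top_p": 0.95,
    "repetition_penalty": 1.03,
    "max_new_tokens": 16384,
}


def extract_boxed(text):
    """Return the contents of the last \\boxed{...} in text, or None."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1  # we are inside the opening brace of \boxed{
    chars = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
    return None  # unbalanced braces: treat as a failed extraction


def solve(model, tokenizer, problem):
    """Hypothetical helper: sample one solution and extract its boxed answer.

    Assumes `model` and `tokenizer` are a transformers causal LM and its
    tokenizer, loaded by the caller.
    """
    inputs = tokenizer(problem, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, **GENERATION_KWARGS)
    # Keep only the newly generated tokens, not the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    response = tokenizer.decode(new_tokens, skip_special_tokens=True)
    return response, extract_boxed(response)
```

Taking the last `\boxed{...}` rather than the first is a design choice: long derivations sometimes box intermediate results, and the final answer usually comes at the end. Counting braces keeps nested LaTeX such as `\frac{1}{2}` intact.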