cs-552-2026-thinkinsidethebox/math_model
The cs-552-2026-thinkinsidethebox/math_model is a specialized language model developed by cs-552-2026-thinkinsidethebox, fine-tuned for advanced mathematical reasoning and problem-solving. This model demonstrates strong performance on mathematical benchmarks, achieving a pass@8 of 0.94 on MATH-500. It is optimized for generating accurate mathematical solutions and extracting boxed answers, making it suitable for complex quantitative tasks.
Loading preview...
Overview
The math_model is the final selected model from the cs-552-2026-thinkinsidethebox project, specifically engineered for mathematical reasoning. It is built upon base weights from exports/sft_omr_asy_diagram_repair_from_v2best_lr1e5_r64_s100_no_think_thinking_eval, indicating a foundation potentially optimized for diagram repair and reasoning without explicit 'thinking' steps.
Key Capabilities & Performance
This model excels in mathematical problem-solving, as evidenced by its performance on the local MATH-500 benchmark (n=8, max_new_tokens=16384):
- pass@8: 0.94
- pass@1: 0.79025
- Boxed Extraction Rate: 0.941, indicating high accuracy in identifying and extracting final answers.
- Accuracy Given Extraction: 0.839798087141339, demonstrating strong correctness of extracted solutions.
- Average Response Token Length: 4186.863, suggesting detailed and comprehensive solution generation.
Recommended Usage
For optimal performance, the following decoding parameters are recommended:
- temperature: 0.55
- top_p: 0.95
- repetition_penalty: 1.03
- max_new_tokens: 16384, allowing for extensive mathematical derivations and explanations.