Palmyra-mini: A Compact Model for Advanced Reasoning
Palmyra-mini, developed by Writer, is a compact 1.7-billion-parameter English language model fine-tuned from Qwen/Qwen2.5-1.5B, with a 131,072-token context window. It is designed for complex reasoning and mathematical problem-solving.
Key Capabilities
- Mathematical Proficiency: Achieves strong scores on math benchmarks, including 0.818 on both gsm8k (strict-match) and MATH500, indicating a solid ability to parse and solve grade-school-level and more advanced quantitative problems.
- Complex Reasoning: Shows robust performance on the BBH (get-answer) benchmark (0.5259), part of the BIG-Bench Hard suite, highlighting its capacity for multi-step, multi-domain reasoning tasks.
- Competition-Level Math: Scores 0.6 on the AMC23 benchmark, demonstrating its aptitude for challenging, competition-level mathematics.
Intended Use
This model is primarily intended for research and development in generative AI, particularly for applications demanding strong mathematical and logical reasoning. Users should be aware of potential limitations regarding biased or inaccurate information, as with any language model.
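For research use, the model can be run with the Hugging Face transformers library. The sketch below assumes the hub id "Writer/palmyra-mini" and a chat template shipped with the tokenizer; both are assumptions, so substitute the actual repository name and prompt format if they differ.

```python
# Minimal sketch of querying the model with Hugging Face transformers.
# The hub id below is an assumption; adjust it to the actual repository.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Writer/palmyra-mini"  # assumed hub id

def build_messages(question: str) -> list[dict]:
    """Wrap a question in a chat-style message list."""
    return [{"role": "user", "content": question}]

def generate(question: str, max_new_tokens: int = 512) -> str:
    """Load the model and generate an answer (downloads weights on first use)."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    prompt = tokenizer.apply_chat_template(
        build_messages(question), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(
        output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

Calling `generate("If a train travels 60 miles in 1.5 hours, what is its average speed?")` will download the weights and return the model's answer text.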
Benchmark Highlights
Palmyra-mini's performance across various benchmarks underscores its strengths:
- gsm8k (strict-match): 0.818
- MATH500: 0.818
- AMC23: 0.6
- BBH (get-answer)(exact_match): 0.5259
These results position Palmyra-mini as a capable tool for tasks requiring deep understanding and multi-step reasoning at a compact model size.