Palmyra-mini-thinking-a: Math and Coding Specialist
Palmyra-mini-thinking-a is a 1.7 billion parameter causal language model developed by Writer, fine-tuned from Qwen/Qwen2.5-1.5B. Its 131,072-token context window lets it process very long inputs for complex tasks.
Key Capabilities
This model is specifically engineered for advanced mathematical reasoning and competitive programming, showcasing strong performance across several challenging benchmarks:
- Mathematical Reasoning: Scores 0.886 on MATH500 and 0.8287 on GSM8K (strict-match), indicating proficiency in multi-step arithmetic and complex problem solving. It also scores 0.8 on AMC23.
- Competitive Programming: Generates correct solutions for coding challenges, scoring 0.5631 on Codeforces (pass_rate) and 0.5481 on OlympiadBench (extractive_match).
When to Use This Model
Palmyra-mini-thinking-a is an excellent choice for applications requiring:
- Solving intricate mathematical problems and equations.
- Assisting with competitive programming tasks and code generation.
- Processing and reasoning over long contexts due to its large context window.
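When relying on the long context window, it is worth checking that a prompt actually fits before sending it. An exact count requires the model's own tokenizer; the sketch below instead uses a rough characters-per-token heuristic (an assumption for illustration only) to budget against the 131,072-token window.

```python
# Rough check that a prompt fits palmyra-mini-thinking-a's 131,072-token
# context window. A real pipeline should count tokens with the model's
# actual tokenizer; the ~4-characters-per-token figure below is only a
# crude heuristic for English text, stated here as an assumption.
CONTEXT_WINDOW = 131_072
CHARS_PER_TOKEN = 4  # heuristic, not a tokenizer

def estimated_tokens(text: str) -> int:
    """Very rough token estimate from character count."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_context(prompt: str, reserved_for_output: int = 4_096) -> bool:
    """True if the estimated prompt length leaves room for generation."""
    return estimated_tokens(prompt) + reserved_for_output <= CONTEXT_WINDOW

print(fits_context("Solve x^2 - 5x + 6 = 0."))  # short prompt fits: True
```

Reserving a slice of the window for the generated answer (here 4,096 tokens, an arbitrary choice) matters for reasoning models, which often emit long chains of thought before the final answer.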
As with any language model, users should remain mindful of potential biases and inaccuracies in its outputs and use the model responsibly.