Writer/palmyra-mini-thinking-b-MLX-BF16
The Writer/palmyra-mini-thinking-b-MLX-BF16 is a 1.7-billion-parameter causal language model based on the Qwen2 architecture, developed by Writer. Optimized for Apple Silicon via the MLX framework, this bfloat16-precision model performs strongly on advanced reasoning, mathematical tasks, and competitive-programming challenges. It features an extended context window of 131,072 tokens and uses the ChatML chat template, making it suitable for complex problem-solving and multi-turn conversations.
Palmyra Mini Thinking B - MLX BF16 Overview
This model is a bfloat16-precision build of Writer's Palmyra-mini-thinking-b, optimized for Apple Silicon (M1, M2, M3, and M4 series) via the MLX framework. Built on the Qwen2 architecture, it has approximately 1.7 billion parameters and a 131,072-token context window, with a RoPE theta of 1,000,000 for strong long-context performance. It uses a Qwen2Tokenizer with a 151,936-token vocabulary and follows the ChatML conversation format.
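As a concrete illustration of the ChatML format mentioned above, here is a minimal sketch of how a multi-turn conversation is rendered into a prompt string. The `to_chatml` helper is hypothetical, written only to show the token layout; in practice you would rely on the tokenizer's own chat template rather than hand-rolling this.

```python
def to_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string."""
    parts = []
    for m in messages:
        # Each turn is delimited by the ChatML special tokens.
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # Open an assistant turn so the model generates the next reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a careful math tutor."},
    {"role": "user", "content": "What is 12 * 13?"},
])
print(prompt)
```

The trailing open `<|im_start|>assistant` turn is what cues the model to continue as the assistant rather than echo the conversation.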
Key Capabilities
- Advanced Reasoning: Designed for complex thinking tasks, demonstrating strong logical and abstract understanding.
- Mathematical Prowess: Achieves 92.50% on AMC23 and 88.20% on MATH500, indicating proficiency in advanced high school mathematics and a wide range of mathematical problems. It also scores 60.00% on AIME24 (pass@1).
- Competitive Programming: Scores 63.43% on Codeforces (pass_rate), highlighting its ability to understand algorithmic problems and generate efficient code.
- Apple Silicon Optimization: Tailored for performance on Apple's M-series chips; the bfloat16 weights occupy around 2.9GB on disk, and a machine with 10GB+ of unified memory is recommended for optimal use.
- Extended Context: Supports very long input sequences, beneficial for detailed problem-solving and comprehensive analyses.
Good For
- Developers working on Apple Silicon platforms who need a performant model for reasoning tasks.
- Applications requiring strong mathematical problem-solving capabilities.
- Code generation, debugging, and algorithmic design tasks.
- Building conversational agents that require multi-step thinking and long-context understanding, especially those using the ChatML format.
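For developers on Apple Silicon, a minimal usage sketch with the `mlx-lm` package (installed via `pip install mlx-lm`) might look like the following. This is an illustrative sketch, not an official quickstart; the first `load` call downloads the weights, and sampling parameters are left at defaults.

```python
# Sketch: run this model with mlx-lm on an Apple Silicon Mac.
# Assumes `pip install mlx-lm`; weights (~2.9GB) download on first load.
from mlx_lm import load, generate

model, tokenizer = load("Writer/palmyra-mini-thinking-b-MLX-BF16")

# Build a ChatML prompt via the tokenizer's built-in chat template.
messages = [
    {"role": "user", "content": "Prove that the sum of two even numbers is even."},
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

text = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(text)
```

Using `apply_chat_template` keeps the prompt consistent with the ChatML format the model was trained on, which matters for multi-step reasoning quality.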