Writer/palmyra-mini-thinking-b-MLX-BF16
The Writer/palmyra-mini-thinking-b-MLX-BF16 is a 1.7-billion-parameter causal language model based on the Qwen2 architecture, developed by Writer. Optimized for Apple Silicon via the MLX framework, this bfloat16-precision model performs strongly on advanced reasoning, mathematical tasks, and competitive-programming challenges. It features an extended context window of 131,072 tokens and uses the ChatML chat template, making it suitable for complex problem-solving and multi-turn conversations.
Palmyra Mini Thinking B - MLX BF16 Overview
This model is a bfloat16-precision build of Writer's Palmyra-mini-thinking-b, optimized for Apple Silicon (M1, M2, M3, and M4 series) via the MLX framework. Built on the Qwen2 architecture, it has approximately 1.7 billion parameters and a 131,072-token context window, with a RoPE theta of 1,000,000 for strong long-context performance. It uses a Qwen2Tokenizer with a 151,936-token vocabulary and follows the ChatML conversation format.
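As a concrete illustration of the ChatML format mentioned above, here is a minimal sketch of how a multi-turn conversation is rendered into a prompt string. The `to_chatml` helper is hypothetical, written only to show the token layout; in practice you would rely on the tokenizer's own chat template rather than hand-rolling this.

```python
def to_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string."""
    parts = []
    for m in messages:
        # Each turn is delimited by the ChatML special tokens.
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # Open an assistant turn so the model generates the next reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a careful math tutor."},
    {"role": "user", "content": "What is 12 * 13?"},
])
print(prompt)
```

The trailing open `<|im_start|>assistant` turn is what cues the model to continue as the assistant rather than echo the conversation.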
Key Capabilities
- Advanced Reasoning: Designed for complex thinking tasks, demonstrating strong logical and abstract understanding.
- Mathematical Prowess: Achieves 92.50% on AMC23 and 88.20% on MATH500, indicating proficiency in advanced high school mathematics and a wide range of mathematical problems. It also scores 60.00% on AIME24 (pass@1).
- Competitive Programming: Scores 63.43% on Codeforces (pass_rate), highlighting its ability to understand algorithmic problems and generate efficient code.
- Apple Silicon Optimization: Tailored for performance on Apple's M-series chips; the bfloat16 weights occupy around 2.9GB on disk, and a machine with 10GB+ of unified memory is recommended for optimal use.
- Extended Context: Supports very long input sequences, beneficial for detailed problem-solving and comprehensive analyses.
Good For
- Developers working on Apple Silicon platforms who need a performant model for reasoning tasks.
- Applications requiring strong mathematical problem-solving capabilities.
- Code generation, debugging, and algorithmic design tasks.
- Building conversational agents that require multi-step thinking and long-context understanding, especially those using the ChatML format.
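For developers on Apple Silicon, a minimal usage sketch with the `mlx-lm` package (installed via `pip install mlx-lm`) might look like the following. This is an illustrative sketch, not an official quickstart; the first `load` call downloads the weights, and sampling parameters are left at defaults.

```python
# Sketch: run this model with mlx-lm on an Apple Silicon Mac.
# Assumes `pip install mlx-lm`; weights (~2.9GB) download on first load.
from mlx_lm import load, generate

model, tokenizer = load("Writer/palmyra-mini-thinking-b-MLX-BF16")

# Build a ChatML prompt via the tokenizer's built-in chat template.
messages = [
    {"role": "user", "content": "Prove that the sum of two even numbers is even."},
]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

text = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(text)
```

Using `apply_chat_template` keeps the prompt consistent with the ChatML format the model was trained on, which matters for multi-step reasoning quality.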