lmstudio-community/Devstral-Small-2507-MLX-bf16
Devstral-Small-2507-MLX-bf16 is a 24 billion parameter model from mistralai, converted to Apple's MLX format for efficient inference on Apple Silicon. This bfloat16 version of Devstral-Small-2507 offers a 32768 token context length, making it suitable for tasks that require substantial context on Apple hardware. Its primary differentiator from the original release is the MLX conversion, which enables efficient local inference on Apple devices.
Devstral-Small-2507-MLX-bf16 Overview
This model is a 24 billion parameter variant of mistralai's Devstral-Small-2507, converted to MLX format by the LM Studio team using the mlx_lm framework to deliver efficient performance on Apple Silicon. This bfloat16 conversion keeps the original weights unquantized and retains the 32768 token context length.
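As a quick illustration of how an MLX conversion like this can be run locally, the sketch below uses the mlx_lm package's load and generate helpers. It is a minimal sketch, not part of this release: it assumes an Apple Silicon Mac with mlx-lm installed (pip install mlx-lm), enough unified memory for the bf16 weights, and that the repository name resolves on the Hugging Face hub; the prompt is purely illustrative.

```python
# Minimal sketch: run the MLX bf16 conversion locally with mlx_lm.
# Assumes an Apple Silicon Mac, `pip install mlx-lm`, and enough unified
# memory for ~24B bf16 weights (roughly 48 GB or more).
from mlx_lm import load, generate

# Download (or reuse a cached copy of) the model weights and tokenizer.
model, tokenizer = load("lmstudio-community/Devstral-Small-2507-MLX-bf16")

# Build a chat-formatted prompt; Devstral is an instruction-tuned model.
messages = [{"role": "user", "content": "Write a Python function that parses a CSV header."}]
prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)

# Generate a completion; max_tokens bounds the output length.
response = generate(model, tokenizer, prompt=prompt, max_tokens=256, verbose=True)
print(response)
```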
Key Capabilities
- Apple Silicon Optimization: Engineered for efficient local inference on devices powered by Apple Silicon.
- High Context Length: Supports a 32768 token context, suitable for processing extensive inputs and generating detailed outputs.
- Bfloat16 Precision: Keeps the weights in bfloat16 rather than a lower-bit quantization, favoring numerical fidelity at the cost of higher memory use.
Good for
- Developers and users with Apple Silicon devices seeking high-performance local LLM inference, for example through LM Studio's local server (see the sketch after this list).
- Applications requiring processing of long documents or complex conversational histories.
- Experimentation and development within the MLX ecosystem.
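Since this is an lmstudio-community release, another common path is to load the model in LM Studio and query it through LM Studio's OpenAI-compatible local server. The sketch below is an assumption-laden illustration: it presumes the server is running on its default port (1234) with this model loaded, and the model identifier string is hypothetical; replace it with whatever LM Studio reports for your setup.

```python
# Minimal sketch: query the model via LM Studio's OpenAI-compatible local server.
# Assumes the server is running on the default port 1234 with this model loaded;
# the model identifier below is hypothetical and should match LM Studio's listing.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:1234/v1", api_key="lm-studio")

response = client.chat.completions.create(
    model="devstral-small-2507-mlx-bf16",  # hypothetical identifier
    messages=[{"role": "user", "content": "Explain what a context window is."}],
    max_tokens=200,
)
print(response.choices[0].message.content)
```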