lmstudio-community/Devstral-Small-2507-MLX-bf16

24B parameters · 32768 token context · bfloat16
Released: Jul 9, 2025
License: apache-2.0

Devstral-Small-2507-MLX-bf16 Overview

This model is a 24 billion parameter bfloat16 conversion of Mistral AI's Devstral-Small-2507, packaged for Apple Silicon. It was converted to MLX format by the LM Studio team using the mlx_lm framework for efficient local inference on Apple hardware, and it retains the full 32768 token context length and bfloat16 precision of the original model.

Key Capabilities

  • Apple Silicon Optimization: Engineered for efficient local inference on devices powered by Apple Silicon.
  • High Context Length: Supports a 32768 token context, suitable for processing extensive inputs and generating detailed outputs.
  • Bfloat16 Precision: Utilizes bfloat16 for a balance of performance and numerical stability.
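
A minimal sketch of running the model locally with the mlx_lm package named above. This assumes an Apple Silicon machine with enough unified memory for the 24B bf16 weights (roughly 48 GB) and that the mlx-lm package is installed; the prompt is illustrative only:

```python
# Sketch: load and run the bf16 MLX conversion with mlx_lm.
# Assumes Apple Silicon and sufficient unified memory (~48 GB for bf16 weights).
from mlx_lm import load, generate

# Downloads the weights from the Hugging Face Hub on first use.
model, tokenizer = load("lmstudio-community/Devstral-Small-2507-MLX-bf16")

prompt = "Write a Python function that reverses a string."
response = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(response)
```

For chat-style use, the prompt would normally be built with the tokenizer's chat template before calling generate.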

Good for

  • Developers and users with Apple Silicon devices seeking high-performance local LLM inference.
  • Applications requiring processing of long documents or complex conversational histories.
  • Experimentation and development within the MLX ecosystem.