mlx-community/DeepSeek-R1-Distill-Qwen-14B

14B parameters · FP8 · 32768-token context · Hugging Face

Overview
mlx-community/DeepSeek-R1-Distill-Qwen-14B is a 14-billion-parameter language model, originally developed by deepseek-ai and converted to the MLX format by mlx-community. The conversion enables optimized inference on Apple silicon via the mlx-lm library.

Key Characteristics

  • Parameter Count: 14 billion parameters, offering a balance between performance and computational requirements.
  • Context Length: Supports a substantial context window of 32768 tokens, enabling the model to handle long-form text and complex conversational histories.
  • MLX Optimization: Specifically adapted for the MLX framework, ensuring efficient execution on Apple hardware.

Usage

This model is primarily intended for developers working within the Apple ecosystem who need a capable language model for NLP tasks. The MLX conversion makes it straightforward to load and generate text with the mlx-lm library, and the model supports chat templating for structured conversational interactions.