Fmuaddib/DeepSeek-R1-Distill-Qwen-14B-mlx-fp16

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:14.8BQuant:FP8Ctx Length:32kPublished:Mar 27, 2025License:mitArchitecture:Transformer Open Weights Warm

Fmuaddib/DeepSeek-R1-Distill-Qwen-14B-mlx-fp16 is a 14.8 billion parameter language model, converted by Fmuaddib to the MLX format for efficient deployment on Apple silicon. This model is a distilled version of DeepSeek-R1-Distill-Qwen-14B, originally developed by deepseek-ai. It is designed for general language generation tasks, leveraging the Qwen architecture for robust performance in a portable format.

Loading preview...

Model Overview

This model, Fmuaddib/DeepSeek-R1-Distill-Qwen-14B-mlx-fp16, is a 14.8 billion parameter language model. It has been converted by Fmuaddib into the MLX format, specifically optimized for use with Apple silicon, utilizing mlx-lm version 0.22.1. The original model, deepseek-ai/DeepSeek-R1-Distill-Qwen-14B, is a distilled variant based on the Qwen architecture.

Key Characteristics

  • MLX Format: Optimized for performance on Apple silicon, enabling local inference with mlx-lm.
  • Parameter Count: Features 14.8 billion parameters, offering a balance between performance and computational requirements.
  • Architecture: Based on the Qwen architecture, known for its strong general language capabilities.
  • Distilled Model: Represents a distilled version, suggesting potential optimizations for efficiency while retaining core functionalities.

Usage

This model is primarily intended for developers and researchers looking to run DeepSeek-R1-Distill-Qwen-14B on MLX-compatible hardware. It can be loaded and used for text generation tasks via the mlx_lm library, supporting standard prompt-based generation and chat template application if available in the tokenizer.