cnfusion/QwenPhi-4-0.5b-Draft-mlx-fp16

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:0.5BQuant:BF16Ctx Length:32kPublished:May 3, 2025License:apache-2.0Architecture:Transformer Open Weights Warm

The cnfusion/QwenPhi-4-0.5b-Draft-mlx-fp16 is a 0.5 billion parameter language model, converted by cnfusion to the MLX format from the QwenPhi-4-0.5b-Draft base model. It supports a 32768-token context length and is designed for text generation tasks. This model is notable for its MLX optimization, making it suitable for efficient deployment on Apple Silicon.

Loading preview...

Model Overview

The cnfusion/QwenPhi-4-0.5b-Draft-mlx-fp16 is a 0.5 billion parameter language model, derived from the rdsm/QwenPhi-4-0.5b-Draft base model. It has been specifically converted by cnfusion to the MLX format using mlx-lm version 0.22.3, enabling optimized performance on Apple Silicon hardware.

Key Characteristics

  • Parameter Count: 0.5 billion parameters, offering a compact footprint.
  • Context Length: Supports a substantial 32768-token context window, allowing for processing longer inputs.
  • MLX Optimization: Converted to MLX fp16 format for efficient inference on Apple Silicon devices.
  • Multilingual Support: The base model supports a wide array of languages including Chinese, English, French, Spanish, Portuguese, German, Italian, Russian, Japanese, Korean, Vietnamese, Thai, and Arabic.

Use Cases

This model is primarily intended for text generation tasks. Its MLX optimization makes it particularly suitable for developers looking to deploy small, efficient language models on Apple hardware for applications such as:

  • Local text generation on macOS devices.
  • Experimentation with compact LLMs in an MLX environment.
  • Developing applications that require a multilingual, small-scale language model with good context handling.