pa5haw/Phi-4-mini-instruct-mlx-fp16

Text Generation · Concurrency Cost: 1 · Model Size: 3.8B · Quant: BF16 · Context Length: 32k · Published: May 4, 2026 · License: MIT · Architecture: Transformer · Open Weights · Cold

pa5haw/Phi-4-mini-instruct-mlx-fp16 is a 3.8-billion-parameter instruction-tuned language model, converted to MLX format from Microsoft's Phi-4-mini-instruct. It is optimized for efficient deployment and inference on Apple Silicon using the MLX framework, and its 32,768-token context length makes it suitable for tasks that require extensive contextual understanding and generation.


Model Overview

The pa5haw/Phi-4-mini-instruct-mlx-fp16 model is an MLX-converted version of Microsoft's Phi-4-mini-instruct. This 3.8 billion parameter instruction-tuned language model is specifically designed for efficient execution on Apple Silicon, leveraging the MLX framework (version 0.31.2).

Key Characteristics

  • Architecture: Based on the Phi-4-mini-instruct model developed by Microsoft.
  • Parameter Count: 3.8 billion parameters, balancing capability against computational cost.
  • Context Length: A 32,768-token context window, enabling it to process and generate long sequences of text.
  • MLX Optimization: Converted to mlx-fp16 format, making it highly suitable for local inference on Apple devices with MLX support.
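The parameter count and fp16 precision together give a rough sense of the memory the weights require on-device. As a back-of-the-envelope estimate (an approximation, not a published figure; it ignores KV-cache and runtime overhead):

```python
# Rough memory estimate for the fp16 weights of a 3.8B-parameter model.
# Actual on-disk and resident sizes will differ somewhat.
params = 3.8e9          # parameter count
bytes_per_param = 2     # fp16/bf16 = 16 bits = 2 bytes
weight_gib = params * bytes_per_param / 1024**3
print(f"~{weight_gib:.1f} GiB for the weights alone")  # ~7.1 GiB
```

This suggests the model fits comfortably in the unified memory of 16 GB Apple Silicon machines, with headroom left for the KV cache at long context lengths.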

Usage

This model is intended for developers and researchers working with Apple Silicon hardware who require a capable instruction-tuned model for various natural language processing tasks. Its MLX optimization ensures efficient performance for applications such as:

  • Instruction following and response generation.
  • Text summarization and completion.
  • Conversational AI and chatbots.
  • Prototyping and development on MLX-compatible systems.
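For reference, a typical way to try an MLX-converted model locally is via the `mlx-lm` package's command-line entry point. This is a minimal sketch, assuming `mlx-lm` is installed on an Apple Silicon machine and that the model weights are fetched from the Hub on first use; exact flags may vary between `mlx-lm` versions:

```shell
# Install the MLX LM tooling (Apple Silicon only)
pip install mlx-lm

# One-off generation from the command line; downloads the model on first run
mlx_lm.generate \
  --model pa5haw/Phi-4-mini-instruct-mlx-fp16 \
  --prompt "Summarize the benefits of on-device inference." \
  --max-tokens 128
```

The same package also exposes Python `load`/`generate` helpers for embedding the model in an application rather than invoking it from the shell.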