BlueMoonlight/Qwen3-4B-Instruct-2507-mlx-fp16
Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Feb 11, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights · Warm

BlueMoonlight/Qwen3-4B-Instruct-2507-mlx-fp16 is a 4 billion parameter instruction-tuned causal language model, converted by BlueMoonlight to the MLX format from the original Qwen3-4B-Instruct-2507. The conversion targets efficient deployment and inference on Apple Silicon via the MLX framework, making the model suitable for local execution on compatible hardware. It retains the core capabilities of Qwen3-4B-Instruct, focusing on general-purpose instruction following and text generation.


Model Overview

BlueMoonlight/Qwen3-4B-Instruct-2507-mlx-fp16 is a 4 billion parameter instruction-tuned language model, specifically adapted for the MLX framework. This model is a conversion of the original Qwen/Qwen3-4B-Instruct-2507, performed by BlueMoonlight using mlx-lm version 0.29.1. The primary purpose of this conversion is to enable efficient inference on Apple Silicon devices, leveraging the MLX ecosystem.
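Since the conversion was done with mlx-lm, the model can be loaded with the same library. A minimal sketch, assuming mlx-lm (>= 0.29.1) is installed via `pip install mlx-lm`; the import is guarded because MLX only runs on Apple Silicon, and the exact prompt and `max_tokens` value are illustrative:

```python
# Minimal local-inference sketch with mlx-lm. The import is guarded
# because the mlx/mlx-lm stack is only available on Apple Silicon.
try:
    from mlx_lm import load, generate
    HAVE_MLX = True
except ImportError:
    HAVE_MLX = False

# Chat turns in the standard role/content format consumed by the
# tokenizer's chat template.
messages = [
    {"role": "user", "content": "Explain the MLX framework in one sentence."}
]

if HAVE_MLX:
    # Downloads the weights from the Hugging Face Hub on first use.
    model, tokenizer = load("BlueMoonlight/Qwen3-4B-Instruct-2507-mlx-fp16")
    # apply_chat_template wraps the messages in Qwen's chat markers.
    prompt = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
    text = generate(model, tokenizer, prompt=prompt, max_tokens=256)
    print(text)
```

On non-Apple hardware the script simply skips generation rather than crashing, which keeps the sketch portable for testing.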

Key Characteristics

  • Parameter Count: 4 billion parameters, offering a balance between performance and computational requirements.
  • Instruction-Tuned: Designed to follow instructions effectively, making it suitable for a wide range of conversational and task-oriented applications.
  • MLX Format: Optimized for deployment and execution on Apple Silicon (e.g., M1, M2, M3 chips) via the MLX library, providing native performance benefits.
  • Context Length: Supports a 40,960-token context window, allowing longer prompts and documents to be processed while maintaining coherence.

Use Cases

This model is particularly well-suited for developers and users who:

  • Require a capable instruction-following model for local inference on Apple Silicon hardware.
  • Are working on applications that benefit from a large context window for complex queries or document processing.
  • Need a model for general text generation, summarization, question answering, and conversational AI tasks within the MLX ecosystem.
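For conversational use cases like the last bullet, a multi-turn loop just accumulates chat turns and re-applies the chat template each round. A hedged sketch under the same assumptions as above; `append_turn` is a hypothetical helper introduced here for illustration, not part of mlx-lm:

```python
# Sketch of a multi-turn chat loop on top of mlx-lm. `append_turn` is a
# hypothetical convenience helper; generation again requires mlx-lm on
# Apple Silicon, so the import is guarded.
try:
    from mlx_lm import load, generate
    HAVE_MLX = True
except ImportError:
    HAVE_MLX = False

def append_turn(history, role, content):
    """Return a new history list with one more chat turn appended."""
    return history + [{"role": role, "content": content}]

history = []
history = append_turn(history, "user", "List three uses of a 4B local model.")

if HAVE_MLX:
    model, tokenizer = load("BlueMoonlight/Qwen3-4B-Instruct-2507-mlx-fp16")
    # Re-render the full history each turn so the model sees prior context.
    prompt = tokenizer.apply_chat_template(history, add_generation_prompt=True)
    reply = generate(model, tokenizer, prompt=prompt, max_tokens=512)
    history = append_turn(history, "assistant", reply)
```

Keeping the history as plain role/content dicts means the same loop works with any tokenizer that exposes `apply_chat_template`.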