codelion/Qwen3-0.6B-ICM-DPO-mlx-fp16

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:0.8BQuant:BF16Ctx Length:32kPublished:Jul 10, 2025License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

The codelion/Qwen3-0.6B-ICM-DPO-mlx-fp16 is a 0.8 billion parameter language model, converted by codelion to the MLX format for efficient deployment on Apple silicon. This model is based on the Qwen3 architecture and incorporates Instruction-tuned, CoT (Chain-of-Thought), and DPO (Direct Preference Optimization) training, making it suitable for instruction-following tasks. Its MLX conversion allows for optimized performance on Apple hardware, targeting local inference applications.

Loading preview...

Overview

The codelion/Qwen3-0.6B-ICM-DPO-mlx-fp16 model is a 0.8 billion parameter language model, specifically converted by codelion into the MLX format. This conversion optimizes the model for efficient inference on Apple silicon, leveraging the MLX framework's capabilities.

Key Characteristics

  • Architecture: Based on the Qwen3 model family.
  • Parameter Count: 0.8 billion parameters, offering a balance between performance and computational efficiency.
  • Training Methodology: Incorporates Instruction-tuned, CoT (Chain-of-Thought), and DPO (Direct Preference Optimization) techniques, enhancing its ability to follow instructions and generate coherent, preferred responses.
  • MLX Conversion: Optimized for Apple silicon, enabling local, high-performance inference.
  • Context Length: Supports a context window of 32768 tokens, allowing for processing longer inputs and generating more extensive outputs.

Use Cases

This model is particularly well-suited for:

  • Instruction Following: Excels at tasks requiring precise adherence to given instructions due to its DPO and instruction-tuned training.
  • Local Deployment: Ideal for developers and users looking to run language models efficiently on Apple hardware without relying on cloud services.
  • Experimentation: Provides a compact yet capable model for experimenting with MLX-optimized language models and their performance on local devices.