Qwen/Qwen3-8B-MLX-bf16

Warm
Public
8B
FP8
32768
1
May 23, 2025
License: apache-2.0
Hugging Face

Qwen/Qwen3-8B-MLX-bf16 is an 8.2 billion parameter causal language model from the Qwen series, developed by Qwen. This model uniquely supports seamless switching between a 'thinking mode' for complex logical reasoning, math, and coding, and a 'non-thinking mode' for efficient general-purpose dialogue. It delivers enhanced reasoning capabilities, superior human preference alignment for creative writing and role-playing, and strong agent capabilities with multilingual support for over 100 languages.

Overview

Qwen3-8B-MLX-bf16 Overview

Qwen3-8B-MLX-bf16 is an 8.2 billion parameter causal language model developed by Qwen, part of the latest Qwen3 series. It features a unique capability to seamlessly switch between two operational modes: a thinking mode optimized for complex logical reasoning, mathematics, and code generation, and a non-thinking mode for efficient, general-purpose dialogue. This dual-mode functionality allows for optimal performance across diverse scenarios.

Key Capabilities

  • Enhanced Reasoning: Significantly improves performance in mathematics, code generation, and commonsense logical reasoning compared to previous Qwen models, especially in thinking mode.
  • Human Preference Alignment: Excels in creative writing, role-playing, multi-turn dialogues, and instruction following, providing a more natural and engaging conversational experience.
  • Agent Capabilities: Demonstrates strong expertise in integrating with external tools, achieving leading performance among open-source models for complex agent-based tasks.
  • Multilingual Support: Supports over 100 languages and dialects, offering robust capabilities for multilingual instruction following and translation.
  • Extended Context: Natively supports a context length of 32,768 tokens, extendable up to 131,072 tokens using the YaRN method for processing long texts.

When to Use This Model

This model is ideal for applications requiring dynamic reasoning capabilities, such as complex problem-solving, code generation, or agentic workflows, where the ability to toggle between deep thought and efficient dialogue is beneficial. It is also well-suited for multilingual applications and creative content generation.