Qwen/Qwen3-Next-80B-A3B-Instruct

Warm
Public
80B
FP8
32768
4
Sep 9, 2025
License: apache-2.0
Hugging Face

Qwen/Qwen3-Next-80B-A3B-Instruct is an 80-billion-parameter, instruction-tuned causal language model developed by Qwen. It combines a hybrid attention mechanism with a high-sparsity Mixture-of-Experts (MoE) architecture, aimed at efficient modeling of very long contexts: it natively supports up to 262,144 tokens and can be extended to about 1 million tokens via YaRN. The model emphasizes parameter efficiency and inference speed, particularly on long-context tasks, and reports strong performance across knowledge, reasoning, coding, and alignment benchmarks.
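
The relationship between the native 262,144-token window and the roughly 1-million-token YaRN extension comes down to a scaling factor. The sketch below computes that factor and packs it into a dictionary. The helper name `yarn_rope_scaling` is hypothetical, the key names (`rope_type`, `factor`, `original_max_position_embeddings`) are assumed to follow the RoPE-scaling convention used in Hugging Face style configs, and treating "1 million" as exactly 4 x 262,144 = 1,048,576 tokens is also an assumption. Check the model's own documentation before relying on any of these.

```python
NATIVE_CONTEXT = 262_144  # native context length stated in the description above


def yarn_rope_scaling(target_context: int, native_context: int = NATIVE_CONTEXT) -> dict:
    """Hypothetical helper: derive a YaRN scaling entry for a target context length.

    The key names are an assumption based on common Hugging Face config
    conventions, not something confirmed by this listing.
    """
    if target_context <= native_context:
        raise ValueError("target fits in the native window; no scaling is needed")
    # YaRN stretches the positional encoding by target / native.
    factor = target_context / native_context
    return {
        "rope_type": "yarn",
        "factor": factor,
        "original_max_position_embeddings": native_context,
    }


# Assumed target: four times the native window, i.e. about 1 million tokens.
config_fragment = yarn_rope_scaling(4 * NATIVE_CONTEXT)
print(config_fragment)
```

Note that a deployment may serve a shorter window than the model's native limit, so the effective context length depends on the hosting configuration as well as on the scaling factor.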
