AXCXEPT/Qwen3-EZO-8B-beta

Parameters: 8B
Precision: FP8
Context length: 32768
Released: May 10, 2025
License: apache-2.0
Overview

AXCXEPT/Qwen3-EZO-8B-beta is an 8-billion-parameter language model built on the Qwen3-8B architecture. Despite its smaller size, it performs strongly on multi-turn conversational tasks, achieving an MT-Bench score of 9.08 and a JMT-Bench score of 8.87. According to internal evaluations, this places its multi-turn capabilities on par with larger models such as Gemini 2.5 Flash and GPT-4o.

Key Capabilities

  • Enhanced Multi-Turn Performance: Significantly improves upon the base Qwen3-8B model for complex, multi-turn interactions.
  • Deep-Think Technique: Supports parallel processing of deep-thinking prompts, enabling more robust reasoning.
  • OpenAI API Compatibility: Can be deployed via vLLM, offering compatibility with the OpenAI API for ease of integration.
  • Efficient Operation: Designed to run on a single A40 GPU, making it accessible for various deployment scenarios.
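Because the model can be served through vLLM's OpenAI-compatible API, a client request can be a plain HTTP POST. The sketch below uses only the standard library; the endpoint URL, port, and sampling parameters are assumptions based on typical vLLM defaults, not values taken from this model card:

```python
import json
import urllib.request

# Assumed local vLLM endpoint; start the server first, for example:
#   vllm serve AXCXEPT/Qwen3-EZO-8B-beta
# (exact serve flags are an assumption, not from the model card)
API_URL = "http://localhost:8000/v1/chat/completions"


def build_payload(messages, temperature=0.7, max_tokens=512):
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": "AXCXEPT/Qwen3-EZO-8B-beta",
        "messages": messages,
        "temperature": temperature,
        "max_tokens": max_tokens,
    }


def chat(messages):
    """POST the message history to the vLLM server and return the reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(messages)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


# Example history; `chat(history)` is only called once a server is running.
history = [{"role": "user", "content": "Summarize the Qwen3 architecture."}]
payload = build_payload(history)
print(payload["model"])
```

Any OpenAI-compatible client library can be pointed at the same endpoint instead of raw `urllib`; the request body shape is identical.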

Benchmarks

Internal evaluations conducted on May 13, 2025, using GPT-4o and Gemini 2.5 Flash as judges, indicate strong performance. These tests were run on a single A40 GPU; results may vary under different conditions.

Use Cases

This model is particularly well-suited for applications requiring advanced reasoning over intricate, multi-turn dialogues, where its Deep-Think technique can be leveraged for deeper analysis.