Name: AXCXEPT/Qwen3-EZO-8B-beta API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: AXCXEPT

Overview

AXCXEPT/Qwen3-EZO-8B-beta is an 8-billion-parameter language model built upon the Qwen3-8B architecture. Despite its smaller size, it demonstrates strong performance in multi-turn conversational tasks, achieving MT-Bench scores of 9.08 and JMT-Bench scores of 8.87. This positions its capabilities comparably to larger models such as Gemini 2.5 Flash and GPT-4o, according to internal evaluations.

Key Capabilities

Enhanced Multi-Turn Performance: Significantly improves upon the base Qwen3-8B model for complex, multi-turn interactions.
Deep-Think Technique: Supports parallel processing of deep-thinking prompts, enabling more robust reasoning.
OpenAI API Compatibility: Can be deployed via vLLM, offering compatibility with the OpenAI API for ease of integration.
Efficient Operation: Designed to run on a single A40 GPU, making it accessible for various deployment scenarios.

Benchmarks

Internal evaluations conducted on May 13, 2025, using GPT-4o and Gemini 2.5 Flash as judges, indicate strong performance. These tests were performed on a single A40 GPU, with results potentially varying under different conditions.

Use Cases

This model is particularly well-suited for applications requiring advanced reasoning and handling of intricate, multi-turn dialogues, where its 'Deep-Think' technique can be leveraged for more profound analysis.

Overview

Overview

Key Capabilities

Benchmarks

Use Cases

Full Model Card (README)