OpenDataArena/Qwen3-8B-ODA-Mixture-500k
Text generation · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Dec 31, 2025 · License: apache-2.0 · Architecture: Transformer (open weights)

OpenDataArena/Qwen3-8B-ODA-Mixture-500k is an 8 billion parameter supervised fine-tuned (SFT) model developed by OpenDataArena, based on Qwen3-8B-Base. It was trained using the ODA-Mixture-500k dataset, which was curated from top-performing open corpora identified by the OpenDataArena leaderboard. This model is specifically optimized to enhance general, mathematical, coding, and reasoning capabilities, demonstrating significant performance gains across these domains.


Model Overview

OpenDataArena/Qwen3-8B-ODA-Mixture-500k is an 8 billion parameter supervised fine-tuned (SFT) model built upon the Qwen3-8B-Base architecture. Developed by OpenDataArena, this model leverages the ODA-Mixture-500k dataset, a meticulously curated collection of approximately 500,000 samples. The dataset was assembled by integrating high-quality, efficient corpora from the OpenDataArena leaderboard, including specialized data for math, code, general knowledge, and reasoning.
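As a standard Hugging Face checkpoint, the model can be loaded with the `transformers` library. The sketch below is a minimal, hedged example assuming the usual Qwen3-family chat template and generation defaults; parameter choices such as `max_new_tokens` are illustrative, not recommendations from the model card.

```python
MODEL_ID = "OpenDataArena/Qwen3-8B-ODA-Mixture-500k"

def build_messages(prompt: str) -> list[dict]:
    # Standard single-turn chat format used by Qwen3-family instruct models.
    return [{"role": "user", "content": prompt}]

def generate(prompt: str, max_new_tokens: int = 512) -> str:
    # Imported lazily so the helpers above stay usable without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    # Render the chat template and append the assistant turn marker.
    text = tokenizer.apply_chat_template(
        build_messages(prompt), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
```

For example, `generate("Solve 12 * 37 step by step.")` exercises the math and reasoning capabilities the fine-tune targets.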

Key Capabilities

  • Multi-domain Reasoning: Significantly improves general, mathematical, coding, and reasoning abilities compared to its base model.
  • Data Curation Methodology: Benefits from a unique training data curation pipeline that prioritizes top-performing datasets from the OpenDataArena leaderboard, followed by rigorous deduplication and benchmark decontamination.
  • Balanced Coverage: Employs semantic clustering and uniform sampling during data selection to ensure a broad and balanced representation of reasoning tasks, maximizing generalization.
  • Performance Gains: Achieves an average score of 72.8 across the ODA benchmark suite, outperforming the base model (53.2) and other SFT models on various metrics, particularly in General (71.2) and Reasoning (69.7) domains.
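The "balanced coverage" idea above can be illustrated with a toy sketch: after samples are assigned to semantic clusters, a fixed budget is filled by drawing uniformly across clusters rather than proportionally. The function below is a simplified, hypothetical illustration of uniform sampling given precomputed cluster labels; it is not OpenDataArena's actual pipeline, and the clustering step itself (e.g. embedding + k-means) is assumed to have happened upstream.

```python
import random
from collections import defaultdict

def uniform_sample_by_cluster(items, cluster_ids, budget, seed=0):
    """Select up to `budget` items, spread as evenly as possible
    across clusters so no topic dominates the training mix."""
    rng = random.Random(seed)
    buckets = defaultdict(list)
    for item, cid in zip(items, cluster_ids):
        buckets[cid].append(item)
    # Shuffle within each cluster so the draw inside a cluster is uniform.
    for bucket in buckets.values():
        rng.shuffle(bucket)
    selected = []
    # Round-robin over clusters until the budget is met or data runs out.
    while len(selected) < budget and any(buckets.values()):
        for cid in sorted(buckets):
            if buckets[cid] and len(selected) < budget:
                selected.append(buckets[cid].pop())
    return selected
```

With two equally sized clusters and a budget of 4, this yields two samples from each cluster, which is the uniform (rather than size-proportional) behavior the curation description emphasizes.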

When to Use This Model

  • Complex Problem Solving: Ideal for applications requiring strong performance in general, mathematical, and logical reasoning tasks.
  • Code Generation and Understanding: Suitable for scenarios demanding robust coding capabilities, as evidenced by its strong performance in the code domain.
  • Research in Data-Centric AI: Valuable for researchers interested in the impact of high-quality, curated datasets on model performance, particularly those following the OpenDataArena methodology.