Overview
hadadxyz/Qwen3-8B-Ultra-Distilled is an 8-billion-parameter language model fine-tuned from the base Qwen/Qwen3-8B model. Its primary goal is to significantly improve the base model's complex, step-by-step reasoning and to broaden its general instruction-following capabilities. The model was trained with Supervised Fine-Tuning (SFT) using Parameter-Efficient Fine-Tuning (PEFT) on a single NVIDIA A100 80GB GPU.
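The exact training recipe is not published here. As an illustration only, the sketch below shows a typical LoRA-based SFT run using the peft and trl libraries; the dataset name and every hyperparameter (rank, alpha, learning rate, batch sizes) are assumptions, not the model's actual configuration.

```python
# Illustrative LoRA-based SFT sketch; hyperparameters and dataset are
# assumptions, not the published training recipe for this model.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Hypothetical chat-formatted dataset of reasoning traces.
dataset = load_dataset("your-org/your-reasoning-dataset", split="train")

peft_config = LoraConfig(
    r=16,                    # assumed LoRA rank
    lora_alpha=32,           # assumed scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="qwen3-8b-sft",
    per_device_train_batch_size=2,   # sized for a single A100 80GB
    gradient_accumulation_steps=8,
    learning_rate=2e-4,              # assumed
    num_train_epochs=1,
    bf16=True,
)

trainer = SFTTrainer(
    model="Qwen/Qwen3-8B",           # base model
    train_dataset=dataset,
    peft_config=peft_config,
    args=training_args,
)
trainer.train()
```

Because only the LoRA adapter weights are updated, a run like this fits comfortably in 80GB of GPU memory, which is consistent with the single-GPU setup described above.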
Key Capabilities
- Enhanced Reasoning: Trained on a curated dataset of reasoning traces distilled from advanced AI models such as Claude Opus 4.6, Gemini 3 Pro, and GPT 5.2, with detailed chain-of-thought examples spanning mathematics, science, coding, and logic.
- Improved Instruction Following: Incorporates a diverse set of instruction-response pairs, including an "uncensored" dataset to reduce unnecessary refusals and improve engagement with a wider range of legitimate requests.
- Context Length: Supports a substantial 40,960-token context window, allowing the model to process and generate longer, more complex interactions (see the inference sketch after this list).
- Efficient Training: Reached its specialized capabilities in an estimated 6-9 hours of training on the single-GPU setup described above.
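The sketch below shows one way to load the model and generate with transformers. It assumes the fine-tune inherits the base Qwen3 chat template, including its enable_thinking flag; the prompt and generation length are illustrative.

```python
# Minimal inference sketch, assuming the model keeps the standard Qwen3
# chat template (including the enable_thinking flag).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hadadxyz/Qwen3-8B-Ultra-Distilled"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "user", "content": "Prove that the sum of two odd numbers is even."}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # emit a chain-of-thought block before the answer
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# The 40,960-token context window bounds prompt and generation combined.
outputs = model.generate(**inputs, max_new_tokens=4096)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```

With enable_thinking=False, a standard Qwen3 template skips the reasoning block and answers directly, which is useful for latency-sensitive applications.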
Good For
- Complex Problem Solving: Ideal for applications that need the model to "think through" a problem step-by-step before producing a final answer, as shown in the sketch after this list.
- Analytical Tasks: Excels in domains such as mathematics, scientific inquiry, and coding, where logical deduction and structured reasoning are crucial.
- Broad Instruction Adherence: Suitable for general-purpose instruction following, especially where nuanced understanding and reduced refusal rates are desired.
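For the step-by-step use cases above, it can help to separate the reasoning trace from the final answer. The helper below assumes the base Qwen3 convention that the trace is closed by the `</think>` token (id 151668 in the Qwen3 tokenizer); if this fine-tune changed the template, the split point would differ.

```python
# Split generated token ids into reasoning trace and final answer, assuming
# the base Qwen3 convention that the trace closes with </think> (id 151668).
THINK_END_ID = 151668  # </think> in the Qwen3 tokenizer

def split_reasoning(output_ids: list[int], tokenizer) -> tuple[str, str]:
    try:
        # Index just past the last </think> token; everything before it
        # is the chain-of-thought trace.
        cut = len(output_ids) - output_ids[::-1].index(THINK_END_ID)
    except ValueError:
        cut = 0  # no trace emitted (e.g., enable_thinking=False)
    reasoning = tokenizer.decode(output_ids[:cut], skip_special_tokens=True).strip()
    answer = tokenizer.decode(output_ids[cut:], skip_special_tokens=True).strip()
    return reasoning, answer

# Usage with `outputs`, `inputs`, and `tokenizer` from the inference sketch above:
# new_ids = outputs[0][inputs.input_ids.shape[1]:].tolist()
# reasoning, answer = split_reasoning(new_ids, tokenizer)
```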