## Model Overview
Jackrong/GPT-Distill-Qwen3-8B-Thinking is an 8-billion-parameter instruction-tuned, reasoning-enhanced language model built on the Qwen3-8B base. It features a 16,384-token context window and supports both English and Chinese. The model was developed with Supervised Fine-Tuning (SFT) using Unsloth and incorporates knowledge distillation from large-scale reasoning models (120B/235B class).
## Key Differentiators
- "Thinking" Capability: Explicitly trained to generate an internal reasoning chain, wrapped in `<think>...</think>` tags, before providing a final answer. This significantly improves performance on complex math, logic, and scientific tasks.
- Distilled Intelligence: Inherits advanced reasoning patterns from high-intelligence teacher models (GPT-OSS-120B and Qwen3-235B), allowing an 8B model to mimic the problem-solving approaches of much larger architectures.
- Long Context: Processes extensive documents and conversations up to 16K tokens, making it suitable for tasks requiring broad contextual understanding.
- Efficient Size: Offers high performance in an 8B parameter footprint, optimized for lower VRAM usage.
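Because the reasoning chain is delimited by `<think>...</think>` tags (per this card), downstream code typically needs to separate it from the final answer. A minimal sketch, assuming the tag format above; the helper name is illustrative:

```python
import re

def split_thinking(output: str) -> tuple[str, str]:
    """Split a model response into (reasoning, answer) using <think> tags.

    Returns an empty reasoning string if no <think> block is present.
    """
    match = re.search(r"<think>(.*?)</think>", output, flags=re.DOTALL)
    if match:
        reasoning = match.group(1).strip()
        answer = output[match.end():].strip()
        return reasoning, answer
    return "", output.strip()

# Example: a toy response in the card's format
reasoning, answer = split_thinking("<think>2 + 2 = 4.</think>The answer is 4.")
# reasoning == "2 + 2 = 4."; answer == "The answer is 4."
```

The non-greedy `.*?` with `re.DOTALL` keeps the match to the first closing tag even when the reasoning spans multiple lines.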
## Recommended Use Cases
- Complex Reasoning: Ideal for math problems, logical puzzles, and scientific derivations, leveraging its chain-of-thought (CoT) mechanism.
- Long-Context Tasks: Processing and understanding information from lengthy texts or dialogues.
- Instruction Following: Adheres well to intricate user instructions and constraints.
- Multilingual NLP: Fluent generation and understanding in both Chinese and English.
## Training Details
The model was fine-tuned on approximately 88,000 high-quality examples, including specialized datasets for reasoning and CoT, ShareGPT for conversational flow, and instruction following. Training focused specifically on modeling assistant behavior: the loss was computed only on assistant responses, not on prompt tokens.
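Response-only training is usually implemented by masking prompt tokens out of the loss. A minimal sketch under that assumption (the function name and token IDs are illustrative; the card does not specify the exact mechanism):

```python
# -100 is the conventional PyTorch ignore index for cross-entropy loss:
# positions labeled -100 contribute nothing to the gradient.
IGNORE_INDEX = -100

def mask_prompt_labels(input_ids: list[int], prompt_len: int) -> list[int]:
    """Build training labels from input_ids, masking the prompt span so
    that only the assistant response tokens are supervised."""
    labels = list(input_ids)
    for i in range(min(prompt_len, len(labels))):
        labels[i] = IGNORE_INDEX
    return labels

# Toy sequence: 3 prompt tokens followed by 2 response tokens
labels = mask_prompt_labels([11, 22, 33, 44, 55], prompt_len=3)
# → [-100, -100, -100, 44, 55]
```

With labels built this way, a standard causal-LM cross-entropy loss learns only from the response span, which matches the card's stated focus on assistant behavior.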