Name: Ebumping/Qwen3-32B-Fable-Distill API
Brand: Featherless.ai
Price: 25.00 USD
Availability: InStock
Author: Ebumping

Model Overview

Ebumping/Qwen3-32B-Fable-Distill (version 0.2) is a 32-billion-parameter Qwen3 model developed by Ebumping. It was fine-tuned with Supervised Fine-Tuning (SFT) using the TRL framework on a dataset of 4,207 examples containing reasoning traces distilled from frontier models.

Key Differentiators (v0.2)

Preserved Reasoning Traces: Unlike its predecessor (v0.1), this version maintains distinct <think> blocks for reasoning, preventing reasoning steps from being flattened into the final generation.
Assistant-Only Loss: Training loss is calculated exclusively on assistant tokens, so system and user tokens do not contribute to the gradient and learning is focused on response generation.
Curated Training Data: The model was trained on a refined dataset in which examples without chain-of-thought (CoT) were removed and Claude channel data was converted to the Qwen3 <think> format.
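
To illustrate the assistant-only loss idea, here is a minimal sketch in plain Python. It is not TRL's implementation; the function name mask_non_assistant, the per-token role list, and the toy token ids are hypothetical. The only outside fact it relies on is that -100 is the default ignore index of PyTorch's cross-entropy loss.

```python
# Illustrative sketch only: NOT the TRL code path used to train this model.
# Assumption: each token carries a role tag ("system", "user", "assistant").

IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def mask_non_assistant(token_ids, roles):
    """Return labels in which only assistant tokens keep their ids.

    Every other position is set to IGNORE_INDEX, so a loss that honors
    the ignore index skips it.
    """
    if len(token_ids) != len(roles):
        raise ValueError("token_ids and roles must have the same length")
    return [
        tid if role == "assistant" else IGNORE_INDEX
        for tid, role in zip(token_ids, roles)
    ]


# Toy example with made-up token ids.
token_ids = [11, 12, 13, 21, 22, 23]
roles = ["system", "user", "user", "assistant", "assistant", "assistant"]
labels = mask_non_assistant(token_ids, roles)
print(labels)
```

In this sketch the first three positions are masked, so only the three assistant tokens would contribute to the loss.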

Training Details

The model was trained for 789 steps on the unsloth/qwen3-32b-bnb-4bit base model using LoRA with a rank of 64. The merged weights are in BF16 precision, and the model supports a context length of 32,768 tokens.

VRAM Requirements

VRAM requirements are substantial: the BF16 merged format needs 80 GB+, while quantized GGUF formats range from 20 GB (Q4_K_M) to 40 GB+ (Q8_0).
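
The figures above can be sanity-checked with a rough weights-only estimate. This is a back-of-the-envelope sketch under stated assumptions: a nominal 32e9 parameters, 8.5 bits per weight for Q8_0 (32 int8 values plus one 16-bit scale per block), and roughly 4.85 bits per weight for Q4_K_M, which is an approximation. The helper weight_gb is hypothetical, and the estimate ignores KV cache, activations, and runtime overhead.

```python
# Rough weights-only memory estimate; real usage is higher because of
# KV cache, activations, and runtime overhead.

def weight_gb(n_params, bits_per_weight):
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


N_PARAMS = 32e9  # assumption: nominal parameter count from the model name

BITS_PER_WEIGHT = {
    "BF16": 16.0,
    "Q8_0": 8.5,     # 32 int8 weights + one fp16 scale per block
    "Q4_K_M": 4.85,  # assumption: approximate average for this mixed format
}

estimates = {
    name: weight_gb(N_PARAMS, bits) for name, bits in BITS_PER_WEIGHT.items()
}

for name, gb in estimates.items():
    print(f"{name}: ~{gb:.1f} GB of weights")
```

Under these assumptions the BF16 weights alone come to about 64 GB, which is consistent with the 80 GB+ recommendation once context and overhead are added.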

Use Cases

This model is well suited to applications that need explicit, step-by-step reasoning, where keeping the thought process visible helps with understanding or debugging model outputs.
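
For debugging, it can help to separate the reasoning from the final answer. The following is a minimal sketch that assumes the output contains at most one well-formed <think>...</think> block; the function name split_reasoning and the sample string are hypothetical and do not come from the model card.

```python
import re

# Matches the first <think>...</think> block, including newlines inside it.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text):
    """Split model output into (reasoning, answer).

    If no <think> block is found, reasoning is an empty string and the
    whole text is returned as the answer.
    """
    match = THINK_RE.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Remove the block and keep whatever surrounds it as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


# Made-up sample output for illustration.
sample = "<think>\n2 + 2 is 4.\n</think>\n\nThe answer is 4."
reasoning, answer = split_reasoning(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```

An unterminated <think> tag (for example, when generation is cut off by a token limit) would not match here and the full text would be returned as the answer, so a production parser may need to handle that case separately.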
