TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill

Text generation · Model size: 4B · Quantization: BF16 · Context length: 32K · Published: Nov 17, 2025 · License: apache-2.0 · Architecture: Transformer

TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill is a 4-billion-parameter Qwen3-based language model developed by TeichAI and fine-tuned to distill the behavior, reasoning, and knowledge of Gemini 2.5 Flash. Trained on approximately 54.4 million tokens spanning domains such as academia, finance, health, and programming, it outperforms its base model on multiple benchmarks. It is suited to tasks that benefit from nuanced understanding and an output style modeled on Gemini 2.5 Flash.


Model Overview

TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill is a 4-billion-parameter model built on the Qwen3 architecture. TeichAI produced it through a distillation process, fine-tuning on approximately 54.4 million tokens generated by Gemini 2.5 Flash. The goal of this training was to transfer the behavior, reasoning traces, output style, and knowledge of the larger Gemini 2.5 Flash model into a more compact model.
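As a usage reference, here is a minimal sketch with Hugging Face transformers; the prompt and generation settings are illustrative, and the reasoning-trace parsing follows the pattern documented for the base Qwen3-Thinking checkpoints rather than anything stated in this card.

```python
# Minimal usage sketch with Hugging Face transformers. The prompt and
# generation settings are illustrative; the </think> parsing follows the
# pattern documented for the base Qwen3-Thinking checkpoints.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Explain duration risk in bond portfolios."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=2048)
new_tokens = outputs[0][input_ids.shape[-1]:].tolist()

# Qwen3-Thinking models close their reasoning trace with a </think> token
# (id 151668 in the Qwen3 vocabulary); split the output there.
try:
    split = len(new_tokens) - new_tokens[::-1].index(151668)
except ValueError:
    split = 0  # no </think> found; treat everything as the answer
thinking = tokenizer.decode(new_tokens[:split], skip_special_tokens=True)
answer = tokenizer.decode(new_tokens[split:], skip_special_tokens=True)
print(answer.strip())
```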

Key Capabilities & Performance

This model shows improved performance compared to its base model, unsloth/Qwen3-4B-Thinking-2507, across several benchmarks (a sketch for reproducing such evaluations follows the list). It achieved higher scores on:

  • ARC Challenge: 0.511945 (vs 0.486348)
  • GPQA Diamond Zeroshot: 0.353535 (vs 0.30303)
  • HellaSwag: 0.504382 (vs 0.479785)
  • MMLU: 0.661587 (vs 0.65532)
  • Winogrande: 0.65588 (vs 0.64562)
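The card does not say which harness produced the numbers above, but the task names match EleutherAI's lm-evaluation-harness, so a reproduction attempt might look like the sketch below; the exact task variants and settings are assumptions.

```python
# Hedged reproduction sketch using EleutherAI's lm-evaluation-harness
# (pip install lm-eval). The task names and settings are assumptions;
# the model card does not state how the reported scores were produced.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args=(
        "pretrained=TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill,"
        "dtype=bfloat16"
    ),
    tasks=["arc_challenge", "gpqa_diamond_zeroshot", "hellaswag",
           "mmlu", "winogrande"],
    batch_size=8,
)

for task, metrics in results["results"].items():
    print(task, metrics)
```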

The training dataset covered a wide range of categories, including Academia, Finance, Health, Legal, Marketing, Programming, SEO, and Science, indicating a broad knowledge base. The model was trained with Unsloth and Hugging Face's TRL library for faster fine-tuning; a rough sketch of that kind of setup follows.
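TeichAI's actual training script is not reproduced here; purely as an illustration of the Unsloth + TRL combination, a supervised fine-tuning setup could look like the following, where the dataset file, LoRA settings, and hyperparameters are all placeholders.

```python
# Illustrative Unsloth + TRL supervised fine-tuning sketch. The dataset
# file, LoRA settings, and hyperparameters are placeholders, not the
# configuration TeichAI actually used.
from unsloth import FastLanguageModel
from trl import SFTConfig, SFTTrainer
from datasets import load_dataset

model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/Qwen3-4B-Thinking-2507",
    max_seq_length=32768,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Hypothetical JSONL of Gemini 2.5 Flash transcripts with a "text" column
# holding pre-formatted chat examples.
dataset = load_dataset("json", data_files="gemini_flash_distill.jsonl",
                       split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    args=SFTConfig(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-5,
        output_dir="qwen3-4b-gemini-distill",
    ),
)
trainer.train()
```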

Ideal Use Cases

This model is particularly well-suited for applications where:

  • Mimicking the output style and reasoning of Gemini 2.5 Flash is beneficial.
  • Tasks require knowledge across diverse domains such as finance, health, legal, and programming.
  • Resource-efficient deployment of a model with distilled advanced capabilities is desired; a quantized-loading sketch follows the list.
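For the resource-efficiency point, one common deployment option (generic to transformers models, not something this card specifies) is 4-bit loading with bitsandbytes; a minimal sketch:

```python
# Hedged sketch of memory-efficient 4-bit loading via bitsandbytes; the
# quantization settings are common defaults, not a recommendation from
# the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "TeichAI/Qwen3-4B-Thinking-2507-Gemini-2.5-Flash-Distill"
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)
print(f"Approx. weight memory: {model.get_memory_footprint() / 1e9:.1f} GB")
```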