TeichAI/Qwen3-4B-Thinking-2507-Gemini-3-Pro-Preview-High-Reasoning-Distill

Hugging Face
Text generation · Model size: 4B · Quant: BF16 · Context length: 32k · Published: Nov 24, 2025 · License: apache-2.0 · Architecture: Transformer

TeichAI/Qwen3-4B-Thinking-2507-Gemini-3-Pro-Preview-High-Reasoning-Distill is a 4-billion-parameter language model developed by TeichAI, based on the Qwen3 architecture. It was trained specifically on a Gemini 3 Pro Preview dataset generated with high reasoning effort. The model is optimized for tasks that demand strong analytical capability, particularly in coding and scientific domains, and supports a 40,960-token context length.


Model Overview

TeichAI/Qwen3-4B-Thinking-2507-Gemini-3-Pro-Preview-High-Reasoning-Distill is a 4-billion-parameter language model from TeichAI, built on the unsloth/Qwen3-4B-Thinking-2507 base model. Its key differentiator is distillation on a specialized Gemini 3 Pro Preview dataset generated with high reasoning effort.

Key Capabilities

  • Enhanced Reasoning: The model's training on a high-reasoning dataset suggests improved performance on complex analytical tasks.
  • Specialized Training Data: Utilizes the TeichAI/gemini-3-pro-preview-high-reasoning-250x dataset, indicating a focus on advanced problem-solving.

Good For

This model is particularly well-suited for use cases in:

  • Coding: Its reasoning-focused training can benefit code generation, debugging, and understanding complex programming logic.
  • Science: Applicable to scientific problem-solving, data analysis, and understanding scientific concepts where high reasoning is crucial.
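A minimal way to try the model for tasks like these is with the Hugging Face `transformers` library, sketched below. The generation settings and the example prompt are illustrative assumptions, not values recommended by TeichAI; as a Qwen3 "thinking" variant, the model is used through its chat template.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "TeichAI/Qwen3-4B-Thinking-2507-Gemini-3-Pro-Preview-High-Reasoning-Distill"

def generate(prompt: str, max_new_tokens: int = 1024) -> str:
    """Run a single chat turn through the model and return the decoded reply."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    # The chat template wraps the user message in the model's expected
    # format and appends the generation prompt for the assistant turn.
    text = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens, keeping only the newly generated reply
    # (which may begin with the model's reasoning trace).
    return tokenizer.decode(
        output[0][inputs.input_ids.shape[1]:], skip_special_tokens=True
    )

# Example call (downloads the model weights on first use):
# print(generate("Write a Python function that checks whether a string is a palindrome."))
```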

Training Details

The training run consumed 2.73 million total tokens (input plus output) and cost $32.70 USD, reflecting a targeted, efficient approach for its specific reasoning objective.
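As a quick sanity check on those figures (using only the numbers reported here), the effective training cost per million tokens works out to roughly $12:

```python
total_tokens = 2_730_000   # 2.73M input + output tokens, as reported
total_cost_usd = 32.70     # reported training cost

cost_per_million = total_cost_usd / (total_tokens / 1_000_000)
print(f"${cost_per_million:.2f} per 1M tokens")  # → $11.98 per 1M tokens
```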