TeichAI/Qwen3-4B-Instruct-2507-Gemini-3-Pro-Preview-Distill

Hugging Face
Text Generation · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Dec 24, 2025 · License: apache-2.0 · Architecture: Transformer

TeichAI/Qwen3-4B-Instruct-2507-Gemini-3-Pro-Preview-Distill is a 4 billion parameter instruction-tuned Qwen3 model developed by TeichAI. It was fine-tuned on a Gemini 3 Pro Preview dataset from which the reasoning summaries were removed, so training used only the final answers. The model targets coding and scientific applications, aiming to retain the benefits of a high-reasoning dataset while producing direct answers without explicit reasoning output.


Overview

TeichAI/Qwen3-4B-Instruct-2507-Gemini-3-Pro-Preview-Distill is a 4 billion parameter instruction-tuned model based on the Qwen3 architecture. Developed by TeichAI, this model was fine-tuned on a specialized Gemini 3 Pro Preview dataset. A distinctive aspect of its training is that the reasoning summaries were stripped from the dataset, and the model was fine-tuned exclusively on the final answers. This approach aims to distill direct answer generation from a dataset originally rich in reasoning effort.
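Since the card provides no usage snippet, here is a minimal inference sketch assuming the standard Hugging Face `transformers` API and the usual Qwen3 chat-template conventions; the generation parameters are illustrative, not prescribed by the model authors.

```python
# Sketch of single-turn inference with this checkpoint via `transformers`.
# Assumes the checkpoint ships a chat template, as Qwen3 models typically do.

MODEL_ID = "TeichAI/Qwen3-4B-Instruct-2507-Gemini-3-Pro-Preview-Distill"


def build_messages(user_prompt: str) -> list[dict]:
    """Build a single-turn chat in the messages format expected by
    tokenizer.apply_chat_template()."""
    return [{"role": "user", "content": user_prompt}]


def generate(user_prompt: str, max_new_tokens: int = 512) -> str:
    # Imports are kept inside the function so importing this module
    # does not pull in torch/transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="bfloat16", device_map="auto"
    )
    input_ids = tokenizer.apply_chat_template(
        build_messages(user_prompt),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
```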

Key Capabilities

  • Specialized Training: Fine-tuned on a Gemini 3 Pro Preview dataset, focusing on direct answers.
  • Efficient Training: Utilized Unsloth and Hugging Face's TRL library for 2x faster training.
  • Base Model: Built upon unsloth/Qwen3-4B-Instruct-2507.
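The answers-only data preparation described above can be sketched as a simple mapping step before handing the dataset to a trainer such as TRL's `SFTTrainer`. The record schema (`prompt`/`reasoning`/`answer` fields) is a hypothetical stand-in for the actual dataset fields, which the card does not specify.

```python
def to_answer_only_example(record: dict) -> dict:
    """Drop the reasoning summary and keep only the prompt and final
    answer, in the messages format commonly used for chat SFT.

    The input schema here is assumed, not documented by TeichAI.
    """
    return {
        "messages": [
            {"role": "user", "content": record["prompt"]},
            # The reasoning summary is intentionally discarded: the model
            # is fine-tuned only on the final answer.
            {"role": "assistant", "content": record["answer"]},
        ]
    }


example = to_answer_only_example({
    "prompt": "What is 2 + 2?",
    "reasoning": "Add the two integers: 2 + 2 = 4.",
    "answer": "4",
})
```

In a typical pipeline this function would be applied with `dataset.map(...)` over the full corpus before training.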

Good For

  • Coding: Designed to perform well in programming-related tasks.
  • Science: Suitable for applications within scientific domains.
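For orientation, Qwen3-family checkpoints use a ChatML-style prompt format with `<|im_start|>`/`<|im_end|>` markers. The hand-rolled renderer below is purely illustrative of what a coding query looks like once templated; in practice `tokenizer.apply_chat_template()` should be preferred, since it applies the checkpoint's own template.

```python
IM_START, IM_END = "<|im_start|>", "<|im_end|>"


def render_chatml(messages: list[dict]) -> str:
    """Render messages into a ChatML-style prompt string, ending with an
    open assistant turn so the model continues with its answer."""
    parts = [f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages]
    parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


prompt = render_chatml([
    {"role": "user", "content": "Write a Python one-liner to reverse a string."}
])
```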