TeichAI/Qwen3-8B-Gemini-3-Pro-Preview-Distill
TeichAI/Qwen3-8B-Gemini-3-Pro-Preview-Distill is an 8 billion parameter language model developed by TeichAI, based on the Qwen3 architecture. The model is distilled from a Gemini 3 Pro Preview dataset generated with high reasoning effort, making it particularly adept at complex reasoning tasks. It is recommended for coding and scientific applications, where this specialized training improves performance.
Overview
TeichAI/Qwen3-8B-Gemini-3-Pro-Preview-Distill is an 8 billion parameter language model built upon the unsloth/Qwen3-8B-unsloth-bnb-4bit base model. Its core differentiator lies in its training methodology: it was distilled from a Gemini 3 Pro Preview dataset, specifically curated for high reasoning effort. This specialized training aims to imbue the model with advanced reasoning capabilities, setting it apart from general-purpose LLMs.
Key Capabilities
- Enhanced Reasoning: Directly trained on a dataset designed for high reasoning, suggesting improved performance on complex logical and analytical tasks.
- Specialized Distillation: Trained on responses from Gemini 3 Pro Preview, transferring behavior from a much larger teacher model into an 8B student.
Good For
- Coding: The model is explicitly recommended for coding applications, where its reasoning-focused training aids code understanding and generation.
- Science: Suited for scientific use cases, where strong reasoning and analytical skills are crucial for processing and generating scientific content.
Training Details
The model was trained on a dataset named TeichAI/gemini-3-pro-preview-high-reasoning-250x. The training run cost $32.70 (USD) and processed a total of 2.73 million tokens (input and output combined).
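From the figures above, a quick back-of-the-envelope check gives the effective cost per million tokens (assuming the quoted cost covers exactly the quoted token count, with no fixed overhead):

```python
# Effective training cost per million tokens, derived from the figures
# stated above. Assumption: the quoted $32.70 covers exactly the quoted
# 2.73M tokens, with no separate fixed costs.
total_cost_usd = 32.70   # total training cost
total_tokens = 2.73e6    # input + output tokens processed

cost_per_million = total_cost_usd / (total_tokens / 1e6)
print(f"${cost_per_million:.2f} per million tokens")  # → $11.98 per million tokens
```

At roughly $12 per million tokens, this illustrates how inexpensive a small, targeted distillation run can be compared to pretraining from scratch.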