minpeter/Qwen3-0.6B-Thinking

Hugging Face · Text Generation

Model size: 0.8B · Quantization: BF16 · Context length: 32k · Concurrency cost: 1 · Published: Jan 24, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

minpeter/Qwen3-0.6B-Thinking is a 0.8-billion-parameter causal language model forked from Qwen/Qwen3-0.6B. This version is modified specifically for use as a training target model with PrimeIntellect-ai/verifiers. Its primary difference from the base model is the extraction of the chat_template into a separate Jinja file, an adaptation made for these training workflows.


Overview

minpeter/Qwen3-0.6B-Thinking is a specialized variant of the Qwen3-0.6B causal language model, featuring 0.8 billion parameters and a 40,960-token context length. This model has been specifically adapted by minpeter for integration into training pipelines, particularly as a target model for the PrimeIntellect-ai/verifiers project.

Key Modifications

The primary change in this fork involves the chat_template. It has been extracted from the tokenizer_config.json file and placed into a separate chat_template.jinja file. This modification aligns the model with the latest transformers format for chat templates, facilitating its use in specific training and verification workflows.
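In practical terms, this change should be invisible to downstream code: transformers versions that support standalone template files read chat_template.jinja from the repository automatically, so the chat template is applied exactly as if it were still embedded in tokenizer_config.json. The following is a minimal sketch under that assumption; the example message content is illustrative.

```python
from transformers import AutoTokenizer

# AutoTokenizer picks up chat_template.jinja from the repo automatically
# (assuming a transformers version that supports standalone template files),
# so apply_chat_template behaves the same as with an embedded template.
tokenizer = AutoTokenizer.from_pretrained("minpeter/Qwen3-0.6B-Thinking")

messages = [{"role": "user", "content": "What is 2 + 2?"}]  # illustrative
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,              # return the rendered prompt string
    add_generation_prompt=True,  # append the assistant turn header
)
print(prompt)
```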

Intended Use

This model is intended for developers and researchers working with the PrimeIntellect-ai/verifiers framework who need a Qwen3-0.6B base model with this chat_template configuration. For details on the original model's architecture, performance, and general usage, refer to the official Qwen/Qwen3-0.6B repository.
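Outside of the verifiers pipeline, the model can be exercised like any other causal LM on the Hub. The sketch below uses the standard transformers generation API; the prompt, dtype choice, and generation length are illustrative assumptions, not settings recommended by the model author.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "minpeter/Qwen3-0.6B-Thinking"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Illustrative prompt; the chat template renders it into Qwen3's format.
messages = [{"role": "user", "content": "Briefly explain what a verifier does."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```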