hmdmahdavi/olympiad-curated-qwen3-4b-thinking-distill-30b
Text generation · Model size: 4B · Quant: BF16 · Context length: 32k · Published: Jan 11, 2026 · Architecture: Transformer

hmdmahdavi/olympiad-curated-qwen3-4b-thinking-distill-30b is a 4-billion-parameter language model fine-tuned from Qwen/Qwen3-4B-Thinking-2507. It supports a 40,960-token context length and was trained with the TRL framework. Built on the Qwen3 architecture, it is intended for general text-generation tasks.
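Since the card gives no usage snippet, here is a minimal sketch of how a Qwen3-style fine-tune like this one is typically loaded and queried with the standard Hugging Face `transformers` API. The chat-template call and generation settings are assumptions based on the base model's conventions, not instructions from this model's author.

```python
import os

# Model ID taken from the card above.
MODEL_ID = "hmdmahdavi/olympiad-curated-qwen3-4b-thinking-distill-30b"

def build_messages(prompt: str) -> list:
    """Wrap a user prompt in the chat-message format expected by
    tokenizer.apply_chat_template (assumed Qwen3 chat convention)."""
    return [{"role": "user", "content": prompt}]

# Guarded behind an env var: downloading a 4B BF16 checkpoint
# needs roughly 8 GB of memory plus the network transfer.
if os.environ.get("RUN_MODEL_DEMO"):
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )

    # Render the chat template, tokenize, and generate a reply.
    text = tokenizer.apply_chat_template(
        build_messages("Solve: 2x + 3 = 11"),
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=512)
    # Decode only the newly generated tokens.
    print(tokenizer.decode(out[0][inputs.input_ids.shape[-1]:],
                           skip_special_tokens=True))
```

Because the base model is a "thinking" variant, generated output may include a reasoning trace before the final answer; downstream code should be prepared to parse or strip it.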
