hmdmahdavi/olympiad-curated-qwen3-4b-thinking-distill-30b-5ep-ablation
Text generation | Concurrency cost: 1 | Model size: 4B | Quant: BF16 | Ctx length: 32k | Published: Feb 25, 2026 | Architecture: Transformer

The hmdmahdavi/olympiad-curated-qwen3-4b-thinking-distill-30b-5ep-ablation model is a 4-billion-parameter language model fine-tuned from Qwen/Qwen3-4B-Instruct-2507. Developed by hmdmahdavi and trained with the TRL framework, it supports a 32,768-token context length and is intended for general text generation tasks.
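As a sketch of how a fine-tuned Qwen3-based checkpoint like this is typically used, the snippet below loads the model with the Hugging Face `transformers` library and generates a reply with the tokenizer's chat template. This assumes the repository id above resolves on the Hugging Face Hub and that `transformers` and `torch` are installed; the helper name `generate_reply` and the generation parameters are illustrative, not part of the model card.

```python
MODEL_ID = "hmdmahdavi/olympiad-curated-qwen3-4b-thinking-distill-30b-5ep-ablation"


def generate_reply(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion for a single-turn chat prompt.

    Imports are deferred so the module loads without transformers/torch
    installed; requires `pip install transformers torch` to actually run.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # BF16 matches the quantization listed on the model card.
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )

    # Qwen3 instruct models expect the chat template to frame the prompt.
    messages = [{"role": "user", "content": prompt}]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens, keep only the newly generated tail.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Because the model is published in BF16 with a 32k context window, loading in `torch.bfloat16` avoids an on-the-fly dtype conversion; long prompts up to the 32,768-token limit are supported without extra configuration.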
