GreatGoose/Qwen2.5-0.5B-Instruct-distill-3epoch
TEXT GENERATIONConcurrency Cost:1Model Size:0.5BQuant:BF16Ctx Length:32kPublished:Jan 5, 2026Architecture:Transformer Warm

GreatGoose/Qwen2.5-0.5B-Instruct-distill-3epoch is a 0.5 billion parameter instruction-tuned causal language model, distilled from Qwen/Qwen2.5-0.5B-Instruct. Developed by GreatGoose, this model leverages TRL for training and incorporates GOLD for on-policy distillation. It is designed for general text generation tasks, offering a compact yet capable solution for various applications.

Loading preview...