DistilQwen2.5-0.5B-Instruct is a 0.5-billion-parameter instruction-tuned causal language model from alibaba-pai, built on the Qwen2.5-0.5B-Instruct architecture. It applies black-box and white-box knowledge distillation techniques, including difficulty scoring and task-related resampling, to transfer capabilities from stronger teacher LLMs. Trained primarily on Chinese and English instruction datasets, the model is optimized for efficient conversational AI and general instruction following, and supports a context length of 131,072 tokens.
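Since the model follows the standard Qwen2.5 chat interface, a minimal inference sketch with Hugging Face `transformers` might look like the following; the repository id `alibaba-pai/DistilQwen2.5-0.5B-Instruct` and the prompt are illustrative assumptions, not confirmed by this page:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed Hugging Face repository id for this model.
model_name = "alibaba-pai/DistilQwen2.5-0.5B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",   # pick the checkpoint's native precision
    device_map="auto",    # place the model on GPU if available
)

# Build a chat-formatted prompt using the model's own chat template.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give a short introduction to knowledge distillation."},
]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256)
# Strip the prompt tokens so only the newly generated reply is decoded.
response = tokenizer.decode(
    output_ids[0][inputs.input_ids.shape[1]:], skip_special_tokens=True
)
print(response)
```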