SeongryongJung/qwen2.5-0.5b-ifeval-mixed-kd-alpha05
Text Generation | Concurrency Cost: 1 | Model Size: 0.5B | Quant: BF16 | Ctx Length: 32k | Published: Apr 16, 2026 | Architecture: Transformer
SeongryongJung/qwen2.5-0.5b-ifeval-mixed-kd-alpha05 is a 0.5 billion parameter language model distilled from a Qwen2.5-1.5B-Instruct teacher. This model was specifically trained using knowledge distillation on the IFEvalSFTDataset, focusing on improving its performance on instruction following evaluations. It achieves an observed local IFEval accuracy of 0.4137577002, making it suitable for tasks requiring precise instruction adherence.
Model Overview
SeongryongJung/qwen2.5-0.5b-ifeval-mixed-kd-alpha05 is a compact 0.5 billion parameter language model. It was created through a knowledge distillation process, where a smaller student model (Qwen2.5-0.5B-Instruct) learned from a larger teacher model (Qwen2.5-1.5B-Instruct).
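A minimal usage sketch with the Hugging Face `transformers` library, assuming the checkpoint is available on the Hub under the repo id above and follows the standard Qwen2.5 chat template (the prompt content is illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SeongryongJung/qwen2.5-0.5b-ifeval-mixed-kd-alpha05"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# BF16 matches the quantization listed in the model metadata above.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="bfloat16")

# Qwen2.5 instruct-style models expect a chat-template-formatted prompt.
messages = [{"role": "user", "content": "List three fruits, one per line."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=64)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```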
Key Capabilities
- Instruction Following: The model's training specifically targeted instruction following, utilizing the `IFEvalSFTDataset` for distillation. This makes it adept at interpreting and executing given instructions.
- Knowledge Distillation: The distillation setup used a `distill_alpha` of 0.5 and a `distill_temperature` of 2.0, giving an effective loss mix of 50% Cross-Entropy and 50% Knowledge Distillation loss (see the sketch after this list). This method transfers knowledge efficiently from a more capable teacher model.
- Performance on IFEval: It demonstrates an observed local IFEval accuracy of 0.4137577002, indicating its proficiency in instruction evaluation tasks.
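The loss mix described above corresponds to standard logit distillation. A minimal PyTorch sketch, assuming the teacher and student share a vocabulary; the function and variable names are illustrative and not taken from the actual training code:

```python
import torch.nn.functional as F

def mixed_kd_loss(student_logits, teacher_logits, labels,
                  distill_alpha=0.5, distill_temperature=2.0):
    """Mix of hard-label CE and soft-label KD per the config above (illustrative)."""
    vocab = student_logits.size(-1)
    s = student_logits.view(-1, vocab)
    t = teacher_logits.view(-1, vocab)
    # Hard-label term: ordinary cross-entropy against ground-truth token ids.
    ce = F.cross_entropy(s, labels.view(-1))
    # Soft-label term: KL divergence between temperature-softened distributions,
    # scaled by T^2 (Hinton et al.) so gradient magnitudes stay comparable.
    T = distill_temperature
    kd = F.kl_div(F.log_softmax(s / T, dim=-1),
                  F.softmax(t / T, dim=-1),
                  reduction="batchmean") * (T * T)
    # distill_alpha = 0.5 yields the 50% CE / 50% KD mix reported above.
    return (1.0 - distill_alpha) * ce + distill_alpha * kd
```

With `distill_alpha` at 0.5 the two terms contribute equally; moving it toward 1.0 would weight the teacher's soft targets more heavily, and toward 0.0 would reduce training to plain supervised fine-tuning.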
Good For
- Applications requiring a smaller, efficient model with a focus on instruction adherence.
- Scenarios where a balance between model size and instruction-following capability is crucial.
- Research into knowledge distillation techniques for improving specific task performance in smaller LLMs.