rishabhrj11/distillspec-qwen600m-xsum
Text Generation · Concurrency Cost: 1 · Model Size: 0.8B · Quant: BF16 · Ctx Length: 32k · Published: Dec 6, 2025 · Architecture: Transformer

The rishabhrj11/distillspec-qwen600m-xsum model is a 0.8 billion parameter language model fine-tuned with GKD (Generalized Knowledge Distillation), an on-policy distillation method introduced in "On-Policy Distillation of Language Models". Because the student is trained on sequences it generates itself, it learns to correct its own mistakes rather than only imitating teacher-written text. This makes it well suited to text generation tasks where refined output quality is desired.
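The on-policy idea can be illustrated with a small toy sketch: a "student" policy rolls out sequences by sampling from its own distribution, and at each state it visits, its logits are nudged toward a fixed "teacher" distribution. All names, distributions, and hyperparameters below are illustrative assumptions, not details taken from this model card.

```python
import math
import random

# Toy sketch of on-policy distillation (the idea behind GKD): the student
# generates sequences with its *own* policy, and is pushed toward the
# teacher's next-token distribution at the states it actually visits.
# Everything here (vocab, teacher values, learning rate) is illustrative.

VOCAB = [0, 1]
START = "start"

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Teacher next-token distributions, keyed by the previous token (the "state").
teacher = {START: [0.9, 0.1], 0: [0.2, 0.8], 1: [0.5, 0.5]}

# Student starts with uniform logits at every state.
student_logits = {s: [0.0, 0.0] for s in teacher}

random.seed(0)
lr = 0.3
for _ in range(2000):
    state = START
    for _ in range(4):  # roll out a short sequence with the student's policy
        probs = softmax(student_logits[state])
        token = random.choices(VOCAB, weights=probs)[0]
        # Gradient step on the cross-entropy to the teacher at this state;
        # the gradient w.r.t. logit i is (student_prob_i - teacher_prob_i).
        for i in VOCAB:
            student_logits[state][i] -= lr * (probs[i] - teacher[state][i])
        state = token  # the next state is the token the student just emitted

for s in teacher:
    print(s, [round(p, 2) for p in softmax(student_logits[s])])
```

Because rollouts follow the student's own sampling, updates concentrate on states the student actually reaches, which is what distinguishes on-policy distillation from training only on teacher-generated text.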
