SakanaAI/TinySwallow-1.5B
Text generation · Concurrency cost: 1 · Model size: 1.5B · Quant: BF16 · Context length: 32k · Published: Dec 26, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights
TinySwallow-1.5B is a compact 1.5-billion-parameter Japanese language model developed by Sakana AI and the Swallow Team. It was created with Temporally Adaptive Interpolated Distillation (TAID), a novel knowledge-distillation method, using Qwen2.5-32B-Instruct as the teacher model and Qwen2.5-1.5B-Instruct as the student. The model then underwent further pre-training on Japanese text data to strengthen its Japanese-language capabilities, and is intended for research and development purposes.
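The core idea behind TAID is that, instead of matching the teacher distribution directly, the student is trained against an intermediate target that interpolates between its own distribution and the teacher's, with the interpolation weight growing over training. The NumPy sketch below is a toy illustration of that idea only; the interpolation schedule, loss direction, and function names here are simplifying assumptions, not Sakana AI's exact training recipe.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def interpolated_target(student_logits, teacher_logits, t):
    """Intermediate target distribution (illustrative, not the exact TAID formula).

    t in [0, 1] grows over training, so the target moves gradually from
    the student's own distribution toward the teacher's.
    """
    p_student = softmax(student_logits)
    p_teacher = softmax(teacher_logits)
    return (1.0 - t) * p_student + t * p_teacher

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q), the distillation loss the student would minimize."""
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))

# Toy vocabulary of 5 tokens with random student/teacher logits.
rng = np.random.default_rng(0)
student_logits = rng.normal(size=5)
teacher_logits = rng.normal(size=5)
p_student = softmax(student_logits)

# Early in training (small t) the target is close to the student itself,
# so the loss is small; as t grows, the target approaches the teacher
# and the loss increases, easing the capacity gap between the two models.
loss_early = kl_divergence(interpolated_target(student_logits, teacher_logits, 0.1), p_student)
loss_late = kl_divergence(interpolated_target(student_logits, teacher_logits, 0.9), p_student)
assert loss_early < loss_late
```

The gradual schedule is what distinguishes this from standard distillation, where the student chases the full teacher distribution from step one.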