tokyotech-llm/Qwen3-Swallow-32B-SFT-v0.2
Task: Text generation
Concurrency cost: 2
Model size: 32B
Quantization: FP8
Context length: 32k
Published: Jan 25, 2026
License: apache-2.0
Architecture: Transformer (open weights)

Qwen3-Swallow-32B-SFT-v0.2 is a 32-billion-parameter instruction-tuned large language model developed by tokyotech-llm and built on the Qwen3 architecture. This bilingual Japanese-English model excels at Japanese language tasks and Japanese-English translation while maintaining strong performance in math and coding. It was developed through Continual Pre-Training (CPT), Supervised Fine-Tuning (SFT), and Reinforcement Learning with Verifiable Rewards (RLVR) to strengthen its reasoning capabilities.
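To gauge hardware requirements, the listed 32B parameter count and FP8 quantization translate directly into a weight-memory estimate. The sketch below is back-of-the-envelope arithmetic only: it counts the weights themselves and ignores KV cache, activations, and framework overhead, which add substantially on top.

```python
# Rough VRAM estimate for the model weights alone, assuming the
# 32e9 parameter count stated in the model card. Byte widths are
# the standard dtype sizes (BF16 = 2 bytes, FP8 = 1 byte).
def weight_memory_gb(n_params: float, bytes_per_param: float) -> float:
    """Return the approximate weight footprint in gigabytes."""
    return n_params * bytes_per_param / 1e9

N_PARAMS = 32e9  # 32B parameters

print(f"BF16: {weight_memory_gb(N_PARAMS, 2):.0f} GB")  # 16-bit weights
print(f"FP8:  {weight_memory_gb(N_PARAMS, 1):.0f} GB")  # 8-bit weights
```

At FP8 the weights alone come to roughly 32 GB, about half the BF16 footprint, which is why the quantized release is the one offered for serving.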
