Qwen3-Swallow-32B-SFT-v0.2 Overview
Qwen3-Swallow-32B-SFT-v0.2 is a 32-billion-parameter model in the Qwen3-Swallow family, developed by tokyotech-llm. It is the Supervised Fine-Tuned (SFT) stage of a development pipeline that also includes Continual Pre-Training (CPT) and Reinforcement Learning with Verifiable Rewards (RLVR).
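This overview does not specify a loading recipe, so the following is a minimal sketch using the standard Hugging Face transformers causal-LM API; the repo ID tokyotech-llm/Qwen3-Swallow-32B-SFT-v0.2 and the bfloat16 dtype are assumptions, not confirmed details.

```python
# A minimal loading sketch. The repo ID below is an assumption based on the
# developer and model names in this overview; verify it before use.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tokyotech-llm/Qwen3-Swallow-32B-SFT-v0.2"  # assumed repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 32B parameters: bf16 roughly halves memory vs. fp32
    device_map="auto",           # shard layers across available GPUs
)
```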
Key Capabilities & Features
- Bilingual Proficiency: Optimized for both Japanese and English, with strong cross-lingual understanding and translation.
- Retained STEM Performance: Strategic CPT and SFT on high-quality math and code datasets prevent catastrophic forgetting, preserving mathematics and coding performance.
- Enhanced Reasoning: Achieves reasoning performance on par with, and in some tasks surpassing, the original Qwen3 models.
- Robust Training: Trained through a comprehensive regimen of CPT on 209.7 billion tokens and SFT on 1.1 million samples, with a 32K (32,768-token) context window.
Good For
- Japanese and English Applications: Ideal for use cases requiring strong performance in both Japanese and English, including translation and bilingual content generation.
- Technical and STEM Tasks: Suitable for applications involving mathematical problem-solving and code generation, where reasoning capabilities are crucial.
- Instruction Following: As an SFT model, it is tuned to follow instructions reliably, making it suitable for conversational and task-oriented applications; a minimal chat sketch follows this list.
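To make the bilingual, instruction-following use concrete, here is a hedged sketch of a single-turn Japanese-to-English translation request via the tokenizer's chat template. The repo ID and the presence of a chat template are assumptions based on this overview, not verified details.

```python
# A minimal chat sketch for a Japanese -> English translation request.
# Assumes the (unverified) repo ID below and that the tokenizer ships a
# standard chat template, as Qwen3-derived instruct models typically do.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tokyotech-llm/Qwen3-Swallow-32B-SFT-v0.2"  # assumed repo ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Japanese instruction: "Translate the following sentence into English: ..."
messages = [{
    "role": "user",
    "content": "次の文を英語に翻訳してください: 継続事前学習は日本語の性能を向上させます。",
}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Given the 32,768-token context window noted above, longer bilingual documents should fit in a single prompt, subject to memory limits.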