tokyotech-llm/Qwen3-Swallow-32B-RL-v0.2
Text generation · Model size: 32B · Quantization: FP8 · Context length: 32k · Published: Feb 1, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

Qwen3-Swallow-32B-RL-v0.2 is a 32-billion-parameter bilingual Japanese-English large language model developed by tokyotech-llm and based on the Qwen3 architecture. It was built through continual pre-training, supervised fine-tuning, and reinforcement learning with verifiable rewards. The model excels at Japanese language proficiency and Japanese-English translation while maintaining or improving performance on complex math and coding tasks, making it well suited to applications that require strong reasoning in these domains.
