Name: tokyotech-llm/Qwen3-Swallow-8B-RL-v0.2 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: tokyotech-llm

Qwen3-Swallow-8B-RL-v0.2 Overview

The Qwen3-Swallow-8B-RL-v0.2 is an 8 billion parameter model from the Qwen3-Swallow family, developed by tokyotech-llm. This model is a bilingual Japanese-English LLM, refined through a multi-stage training process including Continual Pre-Training (CPT), Supervised Fine-Tuning (SFT), and Reinforcement Learning with Verifiable Rewards (RLVR). The development focused on enhancing Japanese language and Japanese-English translation capabilities.

Key Capabilities

Bilingual Proficiency: Highly optimized for both Japanese and English language tasks.
Retained STEM Performance: Strategic CPT and SFT pipelines prevented catastrophic forgetting in mathematics and coding, utilizing high-quality math and code datasets with reasoning traces.
Enhanced Reasoning: Achieves reasoning performance on par with, and in some cases surpassing, the original Qwen3 models, particularly in math and coding.
RL-Tuned: The 'RL' in its name signifies its refinement through Reinforcement Learning, further boosting its reasoning abilities.

Good For

Applications requiring strong Japanese and English language understanding and generation.
Tasks involving Japanese-English translation.
Use cases demanding robust mathematical and coding reasoning capabilities.
Developers seeking a model with enhanced reasoning performance in STEM fields, built upon the Qwen3 architecture.