Name: lightonai/Qwen3-8B-SW-Pivot-EN API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: lightonai

Overview

lightonai/Qwen3-8B-SW-Pivot-EN is an 8 billion parameter language model, part of the Qwen3 family, developed by lightonai. It is specifically fine-tuned from Qwen/Qwen3-8B-Base to address multilingual reasoning challenges, particularly for Swahili questions requiring English-based Chain-of-Thought (CoT) reasoning. The model processes Swahili input, performs its entire reasoning internally in English, and then outputs the final answer in Swahili. This unique "pivot" mechanism is explored in the paper "Rethinking the Multilingual Reasoning Gap with Layer Swap" and aims to leverage English reasoning capabilities for non-English tasks.

Key Capabilities & Features

Multilingual Reasoning: Designed for Swahili question answering with English Chain-of-Thought, enabling complex reasoning across language boundaries.
High Context Length: Supports a substantial context window of 32,768 tokens, allowing for processing longer inputs and more complex problems.
Specialized Training: Fine-tuned on approximately 10 billion tokens over 2 epochs using the lightonai/Dolci-Think-SFT-32B-Multilingual dataset, which includes Swahili Q&A pairs with English CoT.
Performance: Achieves an average score of 70.52% across various Swahili benchmarks, including MGSM-Rev2, Global-MMLU-Lite, GPQA-Diamond, AIME 24/25, and HumanEvalPlus, outperforming its Swahili-native and English-only counterparts in overall average.

Ideal Use Cases

Cross-lingual Reasoning: Applications requiring reasoning on Swahili content where English CoT is beneficial or preferred for intermediate steps.
Research in Multilingual LLMs: Useful for researchers studying the multilingual reasoning gap and the effectiveness of pivoting strategies.
Swahili Q&A Systems: Can be integrated into systems that need to answer complex questions in Swahili by leveraging robust English reasoning.

Overview

Overview

Key Capabilities & Features

Ideal Use Cases

Full Model Card (README)