lightonai/Qwen3-8B-SW-Pivot-EN

TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kTool Calling:SupportedPublished:May 11, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

The lightonai/Qwen3-8B-SW-Pivot-EN is an 8 billion parameter English-pivoted reasoning model, fine-tuned from Qwen/Qwen3-8B-Base with a 32,768 token context length. Developed by lightonai, this model is designed to receive Swahili questions, generate its entire reasoning trace in English, and then provide the final answer in Swahili. It excels in multilingual reasoning tasks, particularly for Swahili Q&A with English Chain-of-Thought.

Loading preview...

Overview

lightonai/Qwen3-8B-SW-Pivot-EN is an 8 billion parameter language model, part of the Qwen3 family, developed by lightonai. It is specifically fine-tuned from Qwen/Qwen3-8B-Base to address multilingual reasoning challenges, particularly for Swahili questions requiring English-based Chain-of-Thought (CoT) reasoning. The model processes Swahili input, performs its entire reasoning internally in English, and then outputs the final answer in Swahili. This unique "pivot" mechanism is explored in the paper "Rethinking the Multilingual Reasoning Gap with Layer Swap" and aims to leverage English reasoning capabilities for non-English tasks.

Key Capabilities & Features

  • Multilingual Reasoning: Designed for Swahili question answering with English Chain-of-Thought, enabling complex reasoning across language boundaries.
  • High Context Length: Supports a substantial context window of 32,768 tokens, allowing for processing longer inputs and more complex problems.
  • Specialized Training: Fine-tuned on approximately 10 billion tokens over 2 epochs using the lightonai/Dolci-Think-SFT-32B-Multilingual dataset, which includes Swahili Q&A pairs with English CoT.
  • Performance: Achieves an average score of 70.52% across various Swahili benchmarks, including MGSM-Rev2, Global-MMLU-Lite, GPQA-Diamond, AIME 24/25, and HumanEvalPlus, outperforming its Swahili-native and English-only counterparts in overall average.

Ideal Use Cases

  • Cross-lingual Reasoning: Applications requiring reasoning on Swahili content where English CoT is beneficial or preferred for intermediate steps.
  • Research in Multilingual LLMs: Useful for researchers studying the multilingual reasoning gap and the effectiveness of pivoting strategies.
  • Swahili Q&A Systems: Can be integrated into systems that need to answer complex questions in Swahili by leveraging robust English reasoning.