LLaMAX/GlotMAX-101-8B-LST

Text generation · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 32k · Published: Jan 29, 2026 · License: MIT · Architecture: Transformer · Open weights

GlotMAX-101-8B-LST is an 8-billion-parameter language model developed by LLaMAX, based on the Qwen3-8B architecture with layer-selective tuning. It features a 32,768-token context length and excels at multilingual translation, improving on Qwen3-8B by more than 5 spBLEU points on FLORES-101. The model also maintains strong reasoning capabilities across 16 diverse tasks, performing on par with Qwen3 instruct models.


GlotMAX-101-8B-LST: Multilingual Translation and Reasoning

GlotMAX-101-8B-LST is an 8-billion-parameter language model from the LLaMAX series, built upon the Qwen3-8B instruct architecture. It applies layer-selective tuning with parallel data to enhance its capabilities, particularly in multilingual contexts.

Key Capabilities

  • Exceptional Multilingual Translation: Achieves an average spBLEU improvement of more than 5 points over Qwen3-8B on the FLORES-101 benchmark, indicating strong performance across all 101 languages.
  • Robust Reasoning: Performs on par with Qwen3 instruct models across 16 diverse reasoning tasks, including BBEH, LiveCodeBench, and OlymMATH.
  • Extensive Language Support: Supports 101 languages, ranging from Afrikaans to Zulu, making it suitable for a wide array of global applications.
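As a sketch of how a model like this might be used for translation, assuming the weights are published under the id `LLaMAX/GlotMAX-101-8B-LST` on Hugging Face and follow the standard `transformers` chat interface (the prompt wording below is an illustration, not taken from the model card):

```python
# Hypothetical usage sketch. The model id and prompt wording are assumptions;
# only the Hugging Face `transformers` calls themselves are standard API.

def build_translation_messages(text, src_lang, tgt_lang):
    """Build a chat-style message list asking the model to translate."""
    return [
        {
            "role": "user",
            "content": f"Translate the following {src_lang} text into {tgt_lang}:\n{text}",
        }
    ]

if __name__ == "__main__":
    messages = build_translation_messages("Goeie môre", "Afrikaans", "English")
    print(messages[0]["content"])

    # Actual inference (requires enough GPU memory for the 8B FP8 weights):
    # from transformers import AutoModelForCausalLM, AutoTokenizer
    # tok = AutoTokenizer.from_pretrained("LLaMAX/GlotMAX-101-8B-LST")
    # model = AutoModelForCausalLM.from_pretrained(
    #     "LLaMAX/GlotMAX-101-8B-LST", device_map="auto"
    # )
    # inputs = tok.apply_chat_template(
    #     messages, add_generation_prompt=True, return_tensors="pt"
    # ).to(model.device)
    # out = model.generate(inputs, max_new_tokens=256)
    # print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Because the model is chat-tuned, translation is driven entirely by the instruction in the user message; no language-specific control tokens are assumed here.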

What Makes This Model Different?

Unlike many models that prioritize either reasoning or translation, GlotMAX-101-8B-LST is specifically designed to excel in both. Its layer-selective tuning approach allows it to significantly boost translation performance without compromising the strong reasoning capabilities inherited from its Qwen3 base. This makes it a powerful choice for applications requiring both accurate multilingual understanding and complex problem-solving.

Should You Use This?

GlotMAX-101-8B-LST is ideal for use cases demanding high-quality multilingual translation alongside robust general-purpose reasoning. If your application involves processing or generating content in multiple languages, or requires a model that can handle complex logical tasks while also translating effectively, this model offers a compelling balance of capabilities. Its 32,768-token context length further supports longer, more intricate multilingual inputs.
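For documents that may exceed the 32,768-token window, inputs can be split up front. A minimal sketch, assuming a rough characters-per-token heuristic (real counts should come from the model's tokenizer, and the ratio varies widely across languages and scripts):

```python
# Rough context-budget chunker. The 4-chars-per-token ratio is a heuristic
# assumption; use the model's actual tokenizer for precise token counts.

CTX_TOKENS = 32768       # GlotMAX-101-8B-LST context length
CHARS_PER_TOKEN = 4      # crude average; varies by language and script

def chunk_text(text, max_tokens=CTX_TOKENS // 2):
    """Split text into paragraph-aligned chunks under an estimated token budget."""
    budget = max_tokens * CHARS_PER_TOKEN
    chunks, current = [], ""
    for para in text.split("\n\n"):
        candidate = (current + "\n\n" + para) if current else para
        if len(candidate) <= budget:
            current = candidate
        else:
            if current:
                chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Reserving only half the window for input (as the default above does) leaves room for the generated translation, which is a reasonable starting point for translation workloads.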