HebArabNlpProject/Hebatron_base_long

Text Generation | Concurrency Cost: 2 | Model Size: 30B | Quant: FP8 | Ctx Length: 32k | Published: May 3, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

HEBATRON is a 31.6-billion-parameter language model developed by PwC Israel, MAFAT, and AWS and specialized for Hebrew. It uses a hybrid Mamba2 and Mixture-of-Experts (MoE) architecture that provides linear scaling for long-context tasks of up to 65,536 tokens. Optimized for native-level reasoning in both Hebrew and English, it targets advanced Hebrew document analysis and complex bilingual reasoning.


HEBATRON: Hebrew-Specialized Mamba2-MoE

HEBATRON is a language model designed specifically for Hebrew, developed through a collaboration between PwC Israel, MAFAT, and AWS. It uses a hybrid architecture that combines Mamba2 and Mixture-of-Experts (MoE), and it is a localized and enhanced version of the Nemotron-3-Nano-30B framework.

Key Capabilities

  • Hybrid Architecture: Combines Mamba2 (SSM) and Sparse MoE for efficient processing.
  • Bilingual Proficiency: Optimized for native-level reasoning in both Hebrew and English.
  • Long Context Window: Supports a 65,536 (64k) token context window, ideal for extensive documents.
  • Specialized Training: Utilizes a three-phase curriculum learning strategy, including formal, colloquial, and long-context data, to handle Hebrew's structural and morphological complexities.
  • Strong Performance: Achieves 91.2% on Hebrew SNLI, 83.3% on GSM8K (Math) in native Hebrew, and 91.6% on English Psychometric Psi.
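
The long-context figure above implies a simple token budget: the prompt and the tokens reserved for the reply must fit in the window together. The following is a minimal sketch of that arithmetic, not part of the model's own tooling. The function names and the `max_new_tokens` reservation are hypothetical, and the token ids are assumed to come from whatever tokenizer is used with the model. The window is a parameter because the listing metadata shows a 32k context length, so a given deployment may allow less than 65,536 tokens.

```python
from typing import List, Sequence

# Token limit stated in the model card; a deployment may set a lower cap.
CONTEXT_WINDOW = 65_536


def fits_context(prompt_tokens: int, max_new_tokens: int,
                 context_window: int = CONTEXT_WINDOW) -> bool:
    """Return True if the prompt plus the reserved output fits in the window."""
    return prompt_tokens + max_new_tokens <= context_window


def split_into_windows(token_ids: Sequence[int], max_new_tokens: int,
                       context_window: int = CONTEXT_WINDOW) -> List[List[int]]:
    """Split token ids into consecutive chunks that each leave room for output."""
    budget = context_window - max_new_tokens  # tokens available for the prompt
    if budget <= 0:
        raise ValueError("max_new_tokens must be smaller than context_window")
    return [list(token_ids[i:i + budget])
            for i in range(0, len(token_ids), budget)]
```

For example, a document of 100,000 tokens with 1,024 tokens reserved for the reply does not fit in a single 65,536-token window, and `split_into_windows` would break it into two chunks of 64,512 and 35,488 tokens. Splitting at fixed token offsets ignores sentence and section boundaries, so a real pipeline would likely prefer to cut at paragraph breaks.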

Intended Use Cases

  • Advanced Hebrew document analysis.
  • Long-context summarization for legal and technical texts.
  • Complex bilingual reasoning tasks.