Hebrew-GPT: Specialized 1B Hebrew Instruction Model
XythicK/Hebrew-GPT is a 1.23 billion parameter instruction-tuned Small Language Model (SLM) built on the Llama 3.2 architecture. It is designed to provide a compact yet powerful solution for Hebrew natural language processing, specifically addressing the challenges of a Morphologically Rich Language (MRL).
Key Capabilities & Features
- Linguistic Specialization: Tuned for Hebrew's unique MRL features, including prefix-suffix handling and correct right-to-left (RTL) context awareness.
- High Precision: Ships full merged BFloat16 weights, preserving the fine-tuned behavior without quantization loss.
- Instruction Optimized: Trained for complex prompt following, document summarization, and dialogue generation in Hebrew.
- Efficiency: Its 1.23 billion parameters make it suitable for high-speed inference and edge deployment on consumer hardware (a minimal loading sketch follows this list).
- Extended Context: Supports a native context length of 128k tokens.
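The sketch below shows one way to load and query the model with the Hugging Face transformers library. The repo ID and BFloat16 precision come from this card; the Hebrew prompt, the chat-template call (which assumes the tokenizer ships a chat template), and the generation settings are illustrative assumptions, not a prescribed recipe.

```python
# Minimal inference sketch using Hugging Face transformers.
# Repo ID and BF16 precision are from the card; the prompt, chat-template
# usage, and generation parameters below are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "XythicK/Hebrew-GPT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are published in BFloat16
    device_map="auto",
)

# Example Hebrew instruction: "Summarize the following paragraph in two sentences: ..."
messages = [
    {"role": "user", "content": "סכם את הפסקה הבאה בשני משפטים: ..."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```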
Training Methodology
The model underwent Supervised Fine-Tuning (SFT) using a multi-source dataset strategy (a sampling sketch follows the list):
- 70% Hebrew Instruction Set: Alpaca-formatted datasets translated and corrected for Hebrew grammar.
- 20% Hebrew Contextual Knowledge: Fact-based data from Hebrew wikis and structured Q&A.
- 10% Logic Preservation: High-quality English instructional data to maintain cross-lingual reasoning and mathematical stability.
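The following sketch illustrates how an SFT loader might draw examples according to the 70/20/10 mixture above. Only the ratios come from this card; the dataset variables and sampling function are hypothetical placeholders.

```python
# Hypothetical sketch of sampling the 70/20/10 SFT mixture described above.
# Dataset arguments are placeholders; only the mixing ratios come from the card.
import random

def sample_training_example(hebrew_instructions, hebrew_knowledge, english_logic):
    """Draw one SFT example according to the stated source proportions."""
    r = random.random()
    if r < 0.70:        # 70% Hebrew instruction set (Alpaca-style, grammar-corrected)
        return random.choice(hebrew_instructions)
    elif r < 0.90:      # 20% Hebrew contextual knowledge / structured Q&A
        return random.choice(hebrew_knowledge)
    else:               # 10% English logic-preservation data
        return random.choice(english_logic)
```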
Limitations
- Hallucination: Like other LLMs, it can generate incorrect information; verification is recommended.
- Bias: May reflect biases present in its training data.
- Parameter Constraints: As a 1B model, it may underperform much larger models (70B+) on highly technical or academic subjects.