Hebrew-GPT: Specialized 1B Hebrew Instruction Model
XythicK/Hebrew-GPT is a 1.23 billion parameter instruction-tuned Small Language Model (SLM) built on the Llama 3.2 architecture. It is designed to provide a compact yet powerful solution for Hebrew natural language processing, specifically addressing the challenges of a Morphologically Rich Language (MRL).
Key Capabilities & Features
- Linguistic Specialization: Tuned for Hebrew's unique MRL features, including prefix-suffix handling and correct right-to-left (RTL) context awareness.
- High Precision: Ships full merged BFloat16 weights, preserving the fine-tuned behavior without quantization loss.
- Instruction Optimized: Trained for complex prompt following, document summarization, and dialogue generation in Hebrew.
- Efficiency: Its 1.23 billion parameters make it suitable for high-speed inference and edge deployment on consumer hardware (a minimal loading sketch follows this list).
- Extended Context: Supports a native context length of 128k tokens.
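The sketch below shows one way to load and query the model with the Hugging Face transformers library. The repo ID and BFloat16 precision come from this card; the Hebrew prompt, the chat-template call (which assumes the tokenizer ships a chat template), and the generation settings are illustrative assumptions, not a prescribed recipe.

```python
# Minimal inference sketch using Hugging Face transformers.
# Repo ID and BF16 precision are from the card; the prompt, chat-template
# usage, and generation parameters below are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "XythicK/Hebrew-GPT"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # weights are published in BFloat16
    device_map="auto",
)

# Example Hebrew instruction: "Summarize the following paragraph in two sentences: ..."
messages = [
    {"role": "user", "content": "סכם את הפסקה הבאה בשני משפטים: ..."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```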
Training Methodology
The model underwent Supervised Fine-Tuning (SFT) using a multi-source dataset strategy (a sampling sketch follows the list):
- 70% Hebrew Instruction Set: Alpaca-formatted datasets translated and corrected for Hebrew grammar.
- 20% Hebrew Contextual Knowledge: Fact-based data from Hebrew wikis and structured Q&A.
- 10% Logic Preservation: High-quality English instructional data to maintain cross-lingual reasoning and mathematical stability.
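The following sketch illustrates how an SFT loader might draw examples according to the 70/20/10 mixture above. Only the ratios come from this card; the dataset variables and sampling function are hypothetical placeholders.

```python
# Hypothetical sketch of sampling the 70/20/10 SFT mixture described above.
# Dataset arguments are placeholders; only the mixing ratios come from the card.
import random

def sample_training_example(hebrew_instructions, hebrew_knowledge, english_logic):
    """Draw one SFT example according to the stated source proportions."""
    r = random.random()
    if r < 0.70:        # 70% Hebrew instruction set (Alpaca-style, grammar-corrected)
        return random.choice(hebrew_instructions)
    elif r < 0.90:      # 20% Hebrew contextual knowledge / structured Q&A
        return random.choice(hebrew_knowledge)
    else:               # 10% English logic-preservation data
        return random.choice(english_logic)
```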
Limitations
- Hallucination: Like other LLMs, it can generate incorrect information; verification is recommended.
- Bias: May reflect biases present in its training data.
- Parameter Constraints: As a 1B model, it may underperform much larger models (70B+) on highly technical or academic subjects.