dicta-il/DictaLM-3.0-24B-Base
DictaLM-3.0-24B-Base is a 24-billion-parameter base language model developed by Dicta, initialized from Mistral-Small-3.1-24B-Base-2503. This model is part of the Dicta-LM 3.0 collection, trained on extensive Hebrew and English corpora, and sets a new state-of-the-art for its weight class in Hebrew language processing. It is designed as a foundational model for further fine-tuning, particularly for applications requiring strong Hebrew language capabilities.
DictaLM-3.0-24B-Base: A New Frontier for Hebrew LLMs
DictaLM-3.0-24B-Base is a 24-billion-parameter foundational language model from Dicta and a significant advance in sovereign Hebrew LLMs. It was initialized from Mistral-Small-3.1-24B-Base-2503 and then trained extensively on large Hebrew and English text corpora. It sets a new state of the art in Hebrew for its parameter class, both as a base model and as a starting point for chat-model fine-tuning.
Key Capabilities & Features
- Bilingual Proficiency: Strong performance in both Hebrew and English, with a particular focus on Hebrew.
- State-of-the-Art Hebrew Performance: Achieves leading benchmark results for its size on Hebrew language tasks.
- Base Model Design: Intended for further fine-tuning to create specialized applications, including chat models.
- Precision: Weights are released unquantized in BF16 (16-bit) precision.
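As a rough sizing aid, the parameter count and BF16 precision listed above imply about 2 bytes per weight. The sketch below works through that arithmetic and shows one plausible way to load the checkpoint with Hugging Face `transformers`. It assumes that the checkpoint loads through `AutoModelForCausalLM` and that `torch`, `transformers`, and `accelerate` are installed, none of which this description confirms; the helper names are illustrative.

```python
BYTES_PER_BF16_PARAM = 2  # bfloat16 stores each weight in 16 bits


def weight_memory_gib(num_params: float, bytes_per_param: int = BYTES_PER_BF16_PARAM) -> float:
    """Approximate size of the raw weights in GiB (excludes activations and KV cache)."""
    return num_params * bytes_per_param / 2**30


def load_base_model(model_id: str = "dicta-il/DictaLM-3.0-24B-Base"):
    """Load tokenizer and model in BF16.

    Assumption: the checkpoint is a causal-LM architecture that
    AutoModelForCausalLM can resolve. The source does not confirm this.
    """
    # Imported lazily so the sizing helper works without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # keep the weights in their published precision
        device_map="auto",           # requires `accelerate`; places layers on available devices
    )
    return tokenizer, model


if __name__ == "__main__":
    # 24e9 is the nominal parameter count; the exact figure may differ slightly.
    print(f"~{weight_memory_gib(24e9):.1f} GiB of weights in BF16")
```

Under these assumptions the weights alone occupy roughly 45 GiB, so a single 48 GB accelerator would leave little headroom for activations and the KV cache; sharding across devices or quantizing is a likely requirement for inference.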
Use Cases & Considerations
This model suits developers and researchers building applications that require robust Hebrew language understanding and generation. As a base model, it provides a strong starting point for fine-tuning on specific tasks. Note that it is not an instruction-tuned chat model and has no built-in moderation mechanisms, so developers must implement their own safety measures for downstream applications. For more details, refer to the technical report.
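Because the model is not instruction-tuned, it continues text rather than following commands, and any moderation has to live in the calling application. The sketch below illustrates both points under stated assumptions: a few-shot prompt that frames the task as a pattern to complete, and a wrapper that runs an application-supplied check on the prompt and on the output. All function names, the `Input`/`Output` labels, and the refusal string are hypothetical, and `is_allowed` stands in for whatever real safety classifier an application would use.

```python
from typing import Callable, Iterable, Tuple


def build_few_shot_prompt(examples: Iterable[Tuple[str, str]], query: str,
                          input_label: str = "Input",
                          output_label: str = "Output") -> str:
    """Format worked examples followed by an open slot for the model to complete."""
    blocks = [f"{input_label}: {src}\n{output_label}: {tgt}" for src, tgt in examples]
    blocks.append(f"{input_label}: {query}\n{output_label}:")
    return "\n\n".join(blocks)


def first_completion_line(generated: str) -> str:
    """A base model may keep writing past the answer; keep only the first non-empty line."""
    for line in generated.splitlines():
        if line.strip():
            return line.strip()
    return ""


def moderated(generate: Callable[[str], str],
              is_allowed: Callable[[str], bool],
              refusal: str = "[blocked by application policy]") -> Callable[[str], str]:
    """Wrap a generation function so prompt and output both pass an application-defined check."""
    def wrapper(prompt: str) -> str:
        if not is_allowed(prompt):
            return refusal
        output = generate(prompt)
        return output if is_allowed(output) else refusal
    return wrapper


if __name__ == "__main__":
    # Hebrew-to-English translation framed as a completion task.
    prompt = build_few_shot_prompt(
        [("שלום", "hello"), ("תודה", "thank you")],
        "בוקר טוב",
        input_label="Hebrew",
        output_label="English",
    )
    print(prompt)
```

In practice `generate` would call the model and pass its raw continuation through `first_completion_line`, while `is_allowed` would be replaced by a real moderation component; a keyword check alone is not an adequate safety measure.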