Model Overview
tartuNLP/Llama-2-7b-Ukrainian is a 7-billion-parameter bilingual model for Ukrainian and English. It is built on the Llama-2-7b base model and adapted through continued pre-training to strengthen its bilingual capabilities.
Key Capabilities
- Bilingual Proficiency: Supports both Ukrainian and English, making it suitable for applications requiring understanding or generation in either language.
- Specialized Training: Continually pre-trained on 5 billion tokens from the CulturaX dataset, with 75% of the documents in Ukrainian.
- Llama-2 Base: Benefits from the robust architecture and general language understanding of the Llama-2-7b model.
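The stated corpus proportions imply a concrete token split. A minimal sketch of that arithmetic (the exact counts below are derived from the card's round figures, not reported separately by the authors):

```python
# Derive the per-language token budget from the stated figures:
# 5 billion tokens total, 75% Ukrainian documents (CulturaX).
total_tokens = 5_000_000_000
uk_share = 0.75

uk_tokens = int(total_tokens * uk_share)  # Ukrainian portion: 3.75B tokens
en_tokens = total_tokens - uk_tokens      # English portion: 1.25B tokens

print(uk_tokens, en_tokens)  # → 3750000000 1250000000
```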
Training Details
The model was trained for 19,080 steps with a batch size of 256, using the AdamW optimizer and bf16 precision. Training used a linear learning rate decay from 2e-5 to 2e-6 and a context length of 1024 tokens.
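The hyperparameters above can be sanity-checked against each other. A minimal sketch of the linear decay schedule and the implied token budget (the helper name `linear_lr` is ours; the numbers come from the card):

```python
# Hyperparameters as stated in the training details.
TOTAL_STEPS = 19_080
LR_START, LR_END = 2e-5, 2e-6

def linear_lr(step: int) -> float:
    """Learning rate at a given step under linear decay from LR_START to LR_END."""
    frac = step / TOTAL_STEPS
    return LR_START + (LR_END - LR_START) * frac

# Tokens processed per optimizer step: batch size x context length.
tokens_per_step = 256 * 1024                   # 262,144 tokens
total_tokens = tokens_per_step * TOTAL_STEPS   # 5,001,707,520

print(linear_lr(0), linear_lr(TOTAL_STEPS), total_tokens)
```

Note that 256 × 1024 × 19,080 ≈ 5.0 billion tokens, which is consistent with the stated 5-billion-token continued pre-training budget.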
Good For
- Applications requiring Ukrainian language generation or comprehension.
- Bilingual tasks involving translation or cross-lingual understanding between Ukrainian and English.
- Research into bilingual model adaptation and continued pre-training strategies, as detailed in the associated paper.