tartuNLP/Llama-2-7b-Ukrainian

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: May 29, 2024 · License: llama2 · Architecture: Transformer · Open Weights

The tartuNLP/Llama-2-7b-Ukrainian model is a 7-billion-parameter bilingual language model created by continued pre-training of Llama-2-7b. It specializes in Ukrainian and English, having been trained on 5 billion tokens from the CulturaX dataset with a document distribution of 75% Ukrainian and 25% English. The model is optimized for tasks requiring proficiency in both languages.


Model Overview

tartuNLP/Llama-2-7b-Ukrainian is a 7-billion-parameter bilingual model designed for Ukrainian and English language processing. It builds on the foundational Llama-2-7b architecture, with continued pre-training to strengthen its bilingual capabilities.

Key Capabilities

  • Bilingual Proficiency: Supports both Ukrainian and English, making it suitable for applications requiring understanding or generation in either language.
  • Specialized Training: Continued pre-training on 5 billion tokens from the CulturaX dataset, with a significant emphasis (75%) on Ukrainian documents.
  • Llama-2 Base: Benefits from the robust architecture and general language understanding of the Llama-2-7b model.
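Because the model keeps the standard Llama-2 architecture, the usual Hugging Face `transformers` loading path should apply. The sketch below is a minimal, unofficial example; the Ukrainian prompt and the generation settings are illustrative assumptions, not values from the model card:

```python
# Minimal text-generation sketch for tartuNLP/Llama-2-7b-Ukrainian.
# The prompt and sampling settings are illustrative assumptions,
# not values taken from the model card.
MODEL_ID = "tartuNLP/Llama-2-7b-Ukrainian"
PROMPT = "Київ — столиця України. "  # "Kyiv is the capital of Ukraine. "

if __name__ == "__main__":
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the bf16 training precision
        device_map="auto",
    )

    inputs = tokenizer(PROMPT, return_tensors="pt").to(model.device)
    # Greedy decoding keeps the example deterministic; swap in sampling
    # parameters (temperature, top_p) for more varied generations.
    output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Since the model is a plain continued-pretrained base model (not instruction-tuned), completion-style prompts like the one above are likely to work better than chat-style instructions.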

Training Details

The model was trained for 19,080 steps with a batch size of 256, using the AdamW optimizer in bf16 precision. Training used a linear learning-rate decay from 2e-5 to 2e-6, with a context length of 1,024 tokens during this phase.
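These hyperparameters are consistent with the stated 5-billion-token budget: steps × batch size × context length comes out to roughly 5B tokens. A quick arithmetic check on the numbers above:

```python
# Sanity-check the token budget implied by the reported hyperparameters.
steps = 19_080          # training steps
batch_size = 256        # sequences per step
context_length = 1_024  # tokens per sequence during continued pre-training

total_tokens = steps * batch_size * context_length
print(total_tokens)  # 5001707520, i.e. ~5.0 billion tokens
```

The result, about 5.002 billion tokens, matches the reported 5B training budget to within 0.1%.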

Good For

  • Applications requiring Ukrainian language generation or comprehension.
  • Bilingual tasks involving translation or cross-lingual understanding between Ukrainian and English.
  • Research into bilingual model adaptation and continued pre-training strategies, as detailed in the associated paper.