tartuNLP/Llammas-base
Overview
Llammas-base is a 7 billion parameter language model built upon the Llama-2 architecture. Developed by tartuNLP, this model has undergone significant continued pre-training using 5 billion tokens from the CulturaX dataset. The dataset composition is notable, consisting of 75% Estonian and 25% English documents, which facilitates cross-lingual knowledge transfer.
Key Capabilities
- Bilingual Proficiency: Enhanced performance in Estonian due to extensive pre-training on Estonian text, while maintaining English language understanding.
- Foundation Model: Serves as the base for the instruction-tuned Llammas model.
- Research-Backed: The model's development and methodology are detailed in a dedicated research paper.
Good For
- Applications requiring strong Estonian language processing.
- Research into cross-lingual knowledge transfer and adaptation of large language models.
- A foundation for further fine-tuning on Estonian-specific or bilingual tasks.
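Example Usage
Since Llammas-base follows the standard Llama-2 architecture, it should load through the usual Hugging Face `transformers` causal-LM API. The sketch below is illustrative, not taken from the model card: it assumes the `transformers` and `torch` packages are installed, and the `generate` helper and the Estonian prompt are hypothetical.

```python
# Hedged usage sketch: loading Llammas-base via the standard
# transformers causal-LM interface. Assumes `transformers` and
# `torch` are installed; the helper below is illustrative only.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "tartuNLP/Llammas-base"  # model id from this card


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a continuation for `prompt` with Llammas-base."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Hypothetical Estonian prompt ("The Estonian language is ...").
    print(generate("Eesti keel on"))
```

As a base (non-instruction-tuned) model, it is best used for text continuation or as a starting point for fine-tuning; for chat-style use, the instruction-tuned Llammas model mentioned above is the more appropriate choice.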