tartuNLP/Llammas-base

Parameters: 7B
Precision: FP8
Context length: 4096 tokens
Feb 16, 2024
License: llama2
Hugging Face
Overview

Llammas-base is a 7-billion-parameter language model built on the Llama-2 architecture. Developed by tartuNLP, it has undergone continued pre-training on 5 billion tokens from the CulturaX dataset. The training data consists of 75% Estonian and 25% English documents, a mix that facilitates cross-lingual knowledge transfer.
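
To make the data mix concrete, the following minimal sketch works out the approximate per-language token budget. It assumes the 75/25 split applies to tokens, whereas the card states it for documents, so the figures are rough estimates. The names `TOTAL_TOKENS`, `MIX`, and `token_budget` are illustrative only.

```python
# Rough token budget implied by the stated data mix.
# Assumption: the 75/25 split is applied to tokens; the card gives it for documents.
TOTAL_TOKENS = 5_000_000_000
MIX = {"Estonian": 0.75, "English": 0.25}


def token_budget(total, mix):
    """Return the approximate number of tokens per language."""
    return {lang: round(total * share) for lang, share in mix.items()}


budget = token_budget(TOTAL_TOKENS, MIX)
for lang, tokens in budget.items():
    print(f"{lang}: {tokens / 1e9:.2f}B tokens")
```

Under that assumption, roughly 3.75 billion tokens are Estonian and 1.25 billion are English.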

Key Capabilities

  • Bilingual Proficiency: Improved performance in Estonian from continued pre-training on predominantly Estonian text, while retaining English language understanding.
  • Foundation Model: Serves as the base for the instruction-tuned Llammas model.
  • Research-Backed: The model's development and methodology are detailed in a dedicated research paper.

Good For

  • Applications requiring strong Estonian language processing.
  • Research into cross-lingual knowledge transfer and adaptation of large language models.
  • Further fine-tuning on specific Estonian or bilingual tasks, with Llammas-base as the foundation model.
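
As a starting point for the uses above, here is a minimal sketch of plain text completion with the model. It assumes the weights can be loaded from the Hugging Face Hub under the repository name shown on this card, and that the `transformers` and `torch` packages are installed. The helper `complete` is a hypothetical name, not part of any library. Because this is a base model rather than the instruction-tuned Llammas, the sketch passes raw text with no chat template.

```python
MODEL_ID = "tartuNLP/Llammas-base"  # repository name as given on this card


def complete(prompt, max_new_tokens=50):
    """Continue `prompt` with the base model (plain completion, no chat template)."""
    # Imported lazily so that defining this function does not load anything.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize the prompt into PyTorch tensors and generate a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads several GB of weights on first use):
# print(complete("Eesti pealinn on"))
```

The example prompt is Estonian for "The capital of Estonia is". Loading the model on every call keeps the sketch short; real code would load the tokenizer and model once and reuse them.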