tartuNLP/Llammas-base

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Feb 16, 2024 | License: llama2 | Architecture: Transformer | 0.0K | Open Weights | Cold

Llammas-base is a 7 billion parameter language model developed by tartuNLP, based on the Llama-2 architecture. It underwent continued pre-training with an additional 5 billion tokens from the CulturaX dataset, comprising 75% Estonian and 25% English documents. This model is specifically designed for cross-lingual knowledge transfer, making it particularly strong in Estonian language processing while retaining English capabilities.


Overview

Llammas-base is a 7 billion parameter language model developed by tartuNLP on the Llama-2 architecture. It was produced by continued pre-training on 5 billion tokens from the CulturaX dataset, with documents split 75% Estonian and 25% English. This mix is intended to facilitate cross-lingual knowledge transfer.
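To make the data mix concrete, the sketch below turns the stated figures into a rough per-language token budget. It assumes the 75/25 document split carries over proportionally to tokens, which the description does not state, so the result is only an approximation. The helper name `token_budget` and the language codes are hypothetical and used for illustration only.

```python
# Figures from the description above: 5 billion tokens, 75% Estonian, 25% English.
# Assumption: the document-level split applies proportionally to tokens.
TOTAL_TOKENS = 5_000_000_000
MIXTURE = {"et": 0.75, "en": 0.25}


def token_budget(total_tokens: int, mixture: dict[str, float]) -> dict[str, int]:
    """Split a total token count across languages according to their shares."""
    if abs(sum(mixture.values()) - 1.0) > 1e-9:
        raise ValueError("mixture shares must sum to 1")
    return {lang: round(total_tokens * share) for lang, share in mixture.items()}


budget = token_budget(TOTAL_TOKENS, MIXTURE)
print(budget)
```

Under that assumption, roughly 3.75 billion of the continued pre-training tokens would be Estonian and 1.25 billion English.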

Key Capabilities

  • Bilingual Proficiency: Enhanced performance in Estonian due to extensive pre-training on Estonian text, while maintaining English language understanding.
  • Foundation Model: Serves as the base for the instruction-tuned Llammas model.
  • Research-Backed: The model's development and methodology are detailed in a dedicated research paper.

Good For

  • Applications requiring strong Estonian language processing.
  • Research into cross-lingual knowledge transfer and adaptation of large language models.
  • Serving as a foundation model for further fine-tuning on specific Estonian or bilingual tasks.
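
For readers who want to try the model, here is a minimal sketch of plain text completion. It assumes the weights are downloadable from the Hugging Face Hub under the ID in the page title and that the `transformers` and `torch` packages are installed. It also reads "Ctx Length: 4k" as 4096 tokens. The helper names `max_prompt_tokens` and `generate` are hypothetical. Because this is a base model rather than the instruction-tuned Llammas, the prompt is treated as text to continue, not as an instruction.

```python
MODEL_ID = "tartuNLP/Llammas-base"  # ID taken from the page title
CONTEXT_LENGTH = 4096  # assumption: "Ctx Length: 4k" means 4096 tokens


def max_prompt_tokens(max_new_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many prompt tokens fit once room is reserved for generation."""
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    return context_length - max_new_tokens


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Continue `prompt` with the base model (downloads the weights on first use)."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Truncate the prompt so that prompt plus new tokens stays within the context window.
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=max_prompt_tokens(max_new_tokens),
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call, left commented out because it downloads several gigabytes:
# print(generate("Eesti pealinn on"))
```

The same loading calls would be the starting point for fine-tuning on Estonian or bilingual tasks, with the training loop left to whichever framework is in use.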