Name: tartuNLP/Llammas-base API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: tartuNLP

Overview

Llammas-base is a 7-billion-parameter language model built on the Llama-2 architecture. Developed by tartuNLP, it has undergone continued pre-training on 5 billion tokens from the CulturaX dataset. The training data consists of 75% Estonian and 25% English documents, a mix that facilitates cross-lingual knowledge transfer.
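
To make the 75/25 document mix concrete, here is a minimal sketch of how a corpus with a fixed language share could be drawn from two document pools. It is an illustration only and not the tartuNLP data pipeline; the function name `sample_mixed_corpus`, its parameters, and the toy pools are all hypothetical.

```python
import random


def sample_mixed_corpus(estonian_docs, english_docs, n_docs,
                        estonian_share=0.75, seed=0):
    """Draw n_docs documents, taking a fixed share from the Estonian pool.

    Hypothetical helper: the share applies to document counts, matching the
    way the listing describes the mix (documents, not tokens).
    """
    rng = random.Random(seed)
    n_estonian = round(n_docs * estonian_share)
    n_english = n_docs - n_estonian
    # random.sample raises ValueError if a pool is smaller than requested.
    picked = rng.sample(estonian_docs, n_estonian) + rng.sample(english_docs, n_english)
    rng.shuffle(picked)  # interleave the two languages
    return picked


# Toy pools standing in for real documents.
estonian_pool = [("et", i) for i in range(100)]
english_pool = [("en", i) for i in range(100)]
mixed = sample_mixed_corpus(estonian_pool, english_pool, n_docs=40)
```

With `n_docs=40` the sketch takes 30 documents from the Estonian pool and 10 from the English pool.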

Key Capabilities

Bilingual Proficiency: Improved Estonian performance from continued pre-training on predominantly Estonian text, while retaining English language understanding.
Foundation Model: Serves as the base for the instruction-tuned Llammas model.
Research-Backed: The model's development and methodology are detailed in a dedicated research paper.

Good For

Applications requiring strong Estonian language processing.
Research into cross-lingual knowledge transfer and adaptation of large language models.
Further fine-tuning on specific Estonian or bilingual tasks, with Llammas-base as the foundation model.
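
For experimentation, a base model like this is typically used for plain text continuation rather than chat. The following is a minimal sketch, assuming the weights are published on the Hugging Face Hub under the id `tartuNLP/Llammas-base` (taken from the listing name) and that `transformers` and `torch` are installed. The helper names `generation_settings` and `complete` are hypothetical, and the decoding settings are illustrative rather than recommended values.

```python
MODEL_ID = "tartuNLP/Llammas-base"  # assumed Hub id, taken from the listing name


def generation_settings(max_new_tokens: int = 40) -> dict:
    """Greedy decoding settings; illustrative values, not tuned."""
    return {"max_new_tokens": max_new_tokens, "do_sample": False}


def complete(prompt: str, max_new_tokens: int = 40) -> str:
    """Continue a plain-text prompt with the base model (no chat template)."""
    # Imported lazily so this module loads without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, **generation_settings(max_new_tokens))
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the 7B weights on first use):
# print(complete("Eesti pealinn on"))
```

Because this is a base model and not the instruction-tuned Llammas, prompts work best when phrased as the beginning of a text to be continued.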