Name: orai-nlp/Gemma-Kimu-9b-base API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: orai-nlp

Gemma-Kimu-9b-base: Basque Language Adaptation

Gemma-Kimu-9b-base is a 9-billion-parameter large language model developed by orai-nlp for the Basque language. It is built on Google’s Gemma-2-9b base model and adapted to Basque through continual pre-training.

Key Capabilities & Features

Basque Language Specialization: Significantly improves performance in Basque language understanding, coherence, and text generation fluency compared to the original Gemma-2-9b.
Dual-Language Training: Continually pre-trained on the large-scale ZelaiHandi dataset (monolingual Basque) combined with a subset of the FineWeb dataset (English, used as replay data).
Base Model: Serves as a foundational model for further instruction-tuning and task-specific adaptations, such as the Gemma-Kimu-9b-it instruction-tuned version.
Syntactic, Lexical, and Morphological Competence: The training methodology targets these three aspects of Basque linguistic competence.
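
The combination of Basque data with English replay described above can be pictured as a stream that draws mostly from the target-language corpus and occasionally from the replay corpus. The following is only an illustrative sketch: the model card does not state the mixing ratio or sampling scheme, so the function name `mix_with_replay`, the `replay_fraction` value of 0.2, and the stopping rule are all assumptions, not the authors' method.

```python
import random
from typing import Iterable, Iterator


def mix_with_replay(
    target_docs: Iterable[str],
    replay_docs: Iterable[str],
    replay_fraction: float = 0.2,  # hypothetical value, not from the model card
    seed: int = 0,
) -> Iterator[str]:
    """Yield documents, drawing from replay_docs with probability replay_fraction.

    Stops as soon as the stream chosen for the next draw is exhausted.
    """
    if not 0.0 <= replay_fraction < 1.0:
        raise ValueError("replay_fraction must be in [0, 1)")
    rng = random.Random(seed)
    target_iter, replay_iter = iter(target_docs), iter(replay_docs)
    while True:
        # Pick the source for this draw: replay (English) or target (Basque).
        source = replay_iter if rng.random() < replay_fraction else target_iter
        try:
            yield next(source)
        except StopIteration:
            return


if __name__ == "__main__":
    basque = [f"eu-{i}" for i in range(8)]   # stand-ins for ZelaiHandi documents
    english = [f"en-{i}" for i in range(8)]  # stand-ins for FineWeb documents
    print(list(mix_with_replay(basque, english)))
```

A fixed seed keeps the interleaving reproducible between runs. In a real pipeline the same idea would usually be applied to tokenized shards rather than raw strings.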

Good For

Developers and researchers working on Basque natural language processing (NLP) tasks.
Fine-tuning for specific Basque-language applications, using this model as the base.
Projects requiring a large language model with improved proficiency in Basque while retaining general English capabilities.
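
Because this is a base model rather than an instruction-tuned one, it is normally prompted with plain text to continue, for example a few-shot pattern, instead of a chat template. The sketch below assumes the weights are published on the Hugging Face Hub under the repository id `orai-nlp/Gemma-Kimu-9b-base` (taken from the name on this page) and that `torch`, `transformers`, and `accelerate` are installed. The helper names `build_few_shot_prompt` and `complete` and the "Galdera/Erantzuna" (question/answer) layout are illustrative choices, not part of the model card.

```python
def build_few_shot_prompt(examples, query):
    """Format (question, answer) pairs plus a final question as a completion prompt."""
    blocks = [f"Galdera: {q}\nErantzuna: {a}" for q, a in examples]
    # Leave the last answer open so the model continues from here.
    blocks.append(f"Galdera: {query}\nErantzuna:")
    return "\n\n".join(blocks)


def complete(prompt, model_id="orai-nlp/Gemma-Kimu-9b-base", max_new_tokens=32):
    """Load the model with Hugging Face transformers and continue the prompt."""
    # Imported lazily so the prompt helper works without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",  # requires the accelerate package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    prompt = build_few_shot_prompt(
        [("Zein da Frantziako hiriburua?", "Paris")],
        "Zein da Italiako hiriburua?",
    )
    print(prompt)
    # Uncomment to run generation; this downloads the full 9B-parameter model.
    # print(complete(prompt))
```

Loading in bfloat16 keeps memory use roughly at two bytes per parameter, so the full model still needs a GPU with substantial memory or a quantized variant.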
