GaMS-9B: A Multilingual Gemma 2-based LLM
GaMS-9B is a 9-billion-parameter model developed by a research team at the University of Ljubljana, Faculty of Computer and Information Science. It belongs to the GaMS (Generative Model for Slovene) family, a line of improved, larger models built on Google's Gemma 2 architecture.
Key Capabilities
- Multilingual Proficiency: Continually pretrained on Slovene, English, Croatian, Serbian, and Bosnian corpora, making it adept at handling these languages. It may also retain support for other languages inherited from the base Gemma 2 model.
- Translation Performance: The GaMS-9B-Instruct variant delivers competitive results on English-to-Slovene and Slovene-to-English translation, ranking alongside top systems such as DeepL and Gemini 1.5 Pro on the SloBench leaderboard (see the usage sketch after this list).
- Slovene SuperGLUE: Demonstrates strong performance on Slovene SuperGLUE classification tasks, outperforming other open-source Slovene models like SlovenianGPT.
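The snippet below is a minimal usage sketch with the Hugging Face transformers library, shown here for the translation use case. It assumes the instruct variant is published under a repo id such as cjvt/GaMS-9B-Instruct; verify the exact id on the model hub before use.

```python
# Minimal generation sketch with Hugging Face transformers.
# Assumption: the instruct variant is available as "cjvt/GaMS-9B-Instruct";
# substitute the actual repository id if it differs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cjvt/GaMS-9B-Instruct"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 9B parameters: a ~24 GB GPU or multi-GPU sharding is advisable
    device_map="auto",
)

# Gemma 2-style chat models expect alternating user/assistant turns (no system role).
messages = [
    {"role": "user", "content": "Prevedi v slovenščino: The weather in Ljubljana is lovely today."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The prompt ("Prevedi v slovenščino: ...", i.e. "Translate into Slovene: ...") illustrates English-to-Slovene translation; any Slovene, English, Croatian, Bosnian, or Serbian instruction can be used in its place.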
Training Details
GaMS-9B was continually pretrained in two stages on the Leonardo HPC system using the NVIDIA NeMo 2.0 framework. The first stage performed cross-lingual alignment on parallel English-Slovene (and some Croatian) corpora; the second stage trained on separate English, Slovene, Croatian, Bosnian, and Serbian datasets totaling 13.62 billion tokens.
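As a rough illustration of that two-stage layout (not the team's actual training configuration, which was defined in NVIDIA NeMo 2.0), the sketch below encodes the stages as a simple data-mix description. Only the 13.62-billion-token stage-2 total comes from the description above; the corpus labels and the missing stage-1 count are placeholders.

```python
# Illustrative description of the two-stage continual-pretraining data mix.
# Only the 13.62B stage-2 token total is reported above; corpus labels and the
# absent stage-1 figure are placeholders, and the real run was configured in
# NVIDIA NeMo 2.0, not with this script.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Stage:
    name: str
    corpora: List[str]
    total_tokens_billion: Optional[float]  # None where the figure is not reported

stages = [
    Stage(
        name="stage-1: cross-lingual alignment",
        corpora=["English-Slovene parallel", "Croatian (smaller share)"],
        total_tokens_billion=None,
    ),
    Stage(
        name="stage-2: monolingual mix",
        corpora=["English", "Slovene", "Croatian", "Bosnian", "Serbian"],
        total_tokens_billion=13.62,
    ),
]

for stage in stages:
    tokens = (
        f"{stage.total_tokens_billion} B tokens"
        if stage.total_tokens_billion is not None
        else "token count not reported"
    )
    print(f"{stage.name}: {', '.join(stage.corpora)} ({tokens})")
```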
Good For
- Applications requiring robust language understanding and generation in Slovene, English, Croatian, Bosnian, and Serbian.
- Translation tasks between English and Slovene.
- Research and development in multilingual NLP, particularly for less-resourced languages in the Balkan region.