GaMS-9B: A Multilingual Gemma 2-based LLM
GaMS-9B is a 9-billion-parameter model developed by a research team at the University of Ljubljana, Faculty of Computer and Information Science. It belongs to the GaMS (Generative Model for Slovene) family, a line of improved, larger models built on Google's Gemma 2 architecture.
Key Capabilities
- Multilingual Proficiency: Continually pretrained on Slovene, English, Croatian, Serbian, and Bosnian corpora, making it adept at handling these languages. It may also retain support for other languages inherited from the base Gemma 2 model.
- Translation Performance: The GaMS-9B-Instruct variant delivers competitive results on English-to-Slovene and Slovene-to-English translation, ranking alongside top systems such as DeepL and Gemini 1.5 Pro on the SloBench leaderboard (see the usage sketch after this list).
- Slovene SuperGLUE: Demonstrates strong performance on Slovene SuperGLUE classification tasks, outperforming other open-source Slovene models like SlovenianGPT.
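The snippet below is a minimal usage sketch with the Hugging Face transformers library, shown here for the translation use case. It assumes the instruct variant is published under a repo id such as cjvt/GaMS-9B-Instruct; verify the exact id on the model hub before use.

```python
# Minimal generation sketch with Hugging Face transformers.
# Assumption: the instruct variant is available as "cjvt/GaMS-9B-Instruct";
# substitute the actual repository id if it differs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cjvt/GaMS-9B-Instruct"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 9B parameters: a ~24 GB GPU or multi-GPU sharding is advisable
    device_map="auto",
)

# Gemma 2-style chat models expect alternating user/assistant turns (no system role).
messages = [
    {"role": "user", "content": "Prevedi v slovenščino: The weather in Ljubljana is lovely today."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The prompt ("Prevedi v slovenščino: ...", i.e. "Translate into Slovene: ...") illustrates English-to-Slovene translation; any Slovene, English, Croatian, Bosnian, or Serbian instruction can be used in its place.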
Training Details
GaMS-9B was continually pretrained in two stages on the Leonardo HPC system using the NVIDIA NeMo 2.0 framework. The first stage performed cross-lingual alignment on parallel English-Slovene (and some Croatian) corpora; the second stage trained on separate English, Slovene, Croatian, Bosnian, and Serbian datasets totaling 13.62 billion tokens.
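As a rough illustration of that two-stage layout (not the team's actual training configuration, which was defined in NVIDIA NeMo 2.0), the sketch below encodes the stages as a simple data-mix description. Only the 13.62-billion-token stage-2 total comes from the description above; the corpus labels and the missing stage-1 count are placeholders.

```python
# Illustrative description of the two-stage continual-pretraining data mix.
# Only the 13.62B stage-2 token total is reported above; corpus labels and the
# absent stage-1 figure are placeholders, and the real run was configured in
# NVIDIA NeMo 2.0, not with this script.
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Stage:
    name: str
    corpora: List[str]
    total_tokens_billion: Optional[float]  # None where the figure is not reported

stages = [
    Stage(
        name="stage-1: cross-lingual alignment",
        corpora=["English-Slovene parallel", "Croatian (smaller share)"],
        total_tokens_billion=None,
    ),
    Stage(
        name="stage-2: monolingual mix",
        corpora=["English", "Slovene", "Croatian", "Bosnian", "Serbian"],
        total_tokens_billion=13.62,
    ),
]

for stage in stages:
    tokens = (
        f"{stage.total_tokens_billion} B tokens"
        if stage.total_tokens_billion is not None
        else "token count not reported"
    )
    print(f"{stage.name}: {', '.join(stage.corpora)} ({tokens})")
```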
Good For
- Applications requiring robust language understanding and generation in Slovene, English, Croatian, Bosnian, and Serbian.
- Translation tasks between English and Slovene.
- Research and development in multilingual NLP, particularly for less-resourced languages in the Balkan region.