cjvt/GaMS-9B-Instruct

Text Generation · Concurrency Cost: 1 · Model Size: 9B · Quant: FP8 · Context Length: 16k · Published: Mar 18, 2025 · License: gemma · Architecture: Transformer

cjvt/GaMS-9B-Instruct is a 9 billion parameter instruction-tuned causal language model developed at the University of Ljubljana, Faculty of Computer and Information Science. Based on Google's Gemma 2 architecture, it is continually pretrained on Slovene, English, Croatian, Bosnian, and Serbian corpora. The model excels at multilingual tasks, particularly for the languages of the former Yugoslavia, and is designed for general instruction-following applications.


GaMS-9B-Instruct: A Multilingual Gemma 2-based LLM

The cjvt/GaMS-9B-Instruct model is a 9 billion parameter instruction-tuned language model, part of the larger GaMS (Generative Model for Slovene) family developed by researchers at the University of Ljubljana. It is built upon Google's Gemma 2 architecture and has undergone extensive continual pre-training on a diverse corpus including Slovene, English, Croatian, Bosnian, and Serbian data, making it particularly adept at handling these languages.

Key Capabilities

  • Multilingual Proficiency: Strong performance in Slovene, English, Croatian, Bosnian, and Serbian, with potential for other Gemma 2-supported languages.
  • Instruction Following: Fine-tuned for general instruction-following tasks, enabling conversational AI and response generation.
  • Robust Training: Continually pre-trained in two stages: first on parallel English–Slovene/Croatian corpora for cross-lingual alignment, then on large monolingual corpora for each language (13.62 billion tokens in total).
  • Supervised Fine-tuning (SFT): Trained on approximately 25,000 SFT examples from various datasets, including specialized Slovene instruction datasets and filtered parallel corpora.
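Because the model inherits Gemma 2's tokenizer, instruction prompts presumably follow the Gemma 2 turn format. A minimal sketch of that template (the exact format is an assumption here, inferred from the base model; in practice `tokenizer.apply_chat_template` handles this for you):

```python
# Hypothetical sketch of the Gemma 2 chat turn format that GaMS-9B-Instruct
# is assumed to inherit from its base model; verify against the model's
# own tokenizer chat template before relying on it.
def build_gemma2_prompt(messages):
    """Render a list of {"role", "content"} messages into the
    Gemma 2 turn format and append the opener for the model's turn."""
    parts = []
    for msg in messages:
        # Gemma 2 labels its own turns "model" rather than "assistant".
        role = "model" if msg["role"] == "assistant" else "user"
        parts.append(f"<start_of_turn>{role}\n{msg['content']}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")
    return "".join(parts)

prompt = build_gemma2_prompt(
    [{"role": "user", "content": "Prevedi v slovenščino: Good morning!"}]
)
```

In everyday use you would not build this string by hand: passing the message list to `transformers.pipeline("text-generation", model="cjvt/GaMS-9B-Instruct")` applies the model's stored chat template automatically.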

Evaluation Highlights

  • Slovenian-LLM-Eval: Demonstrates competitive performance against other models, including base Gemma 2 and SlovenianGPT.
  • SloBench SuperGLUE: Achieves an average score of 0.6997 on Slovene SuperGLUE tasks in a zero-shot setting.
  • Translation Tasks: Ranks highly in English-to-Slovene and Slovene-to-English translation benchmarks, outperforming several other models in its class.

Intended Use Cases

  • Content Creation: Generating text in supported languages, including creative formats.
  • Conversational AI: Powering chatbots and virtual assistants, especially for multilingual applications.
  • Research and Education: Serving as a foundation for NLP research and language learning tools focused on the specified languages.

Popular Sampler Settings

Featherless tracks the three sampler parameter combinations most used with this model. The tracked parameters are:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
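These sampler parameters map naturally onto an OpenAI-style chat-completions request body. A hedged sketch of assembling such a payload (the helper, the endpoint shape, and the example values are illustrative assumptions, not the actual top user configs):

```python
# Illustrative sketch: packing sampler settings into an OpenAI-style
# chat-completions payload. The values below are placeholders, not the
# measured "top 3" Featherless configurations.
def build_request(messages, **samplers):
    """Build a request body, rejecting parameters outside the set
    listed on the model card."""
    allowed = {
        "temperature", "top_p", "top_k", "frequency_penalty",
        "presence_penalty", "repetition_penalty", "min_p",
    }
    unknown = set(samplers) - allowed
    if unknown:
        raise ValueError(f"unsupported sampler parameters: {sorted(unknown)}")
    return {"model": "cjvt/GaMS-9B-Instruct", "messages": messages, **samplers}

payload = build_request(
    [{"role": "user", "content": "Pozdravljeni!"}],
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.1,
)
```

Sending `payload` as JSON to a chat-completions endpoint would apply those sampler settings to the generation; validating the keys up front catches typos before they are silently ignored server-side.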