cjvt/GaMS-27B-Instruct-Nemotron

Parameters: 27B
Precision: FP8
Context length: 32768
Released: Aug 26, 2025
License: gemma

GaMS-27B-Instruct-Nemotron Overview

GaMS-27B-Instruct-Nemotron is an instruction-tuned large language model built on the GaMS-27B-Instruct base. It was developed by Timotej Petrič as part of a master's thesis project focused on enhancing the model's capabilities through supervised fine-tuning.

Key Capabilities

  • Bilingual Instruction Following: The model was fine-tuned on a dataset of approximately 80,000 Slovenian and 20,000 English instruction-response pairs, targeting strong performance in both languages with particular emphasis on Slovenian.
  • Nemotron Dataset Integration: Training used a curated subset of the chat portion of nvidia/Nemotron-Post-Training-Dataset-v1 as the source of instruction-response pairs.
  • Slovenian Language Optimization: Identity and context in the training data were adjusted for Slovenian, making the model well suited to applications that require nuanced understanding and generation in Slovenian.
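Since the GaMS family is built on Google's Gemma models, prompts follow Gemma's turn-marker convention. A minimal sketch of constructing such a prompt by hand for a Slovenian instruction (the exact markers below are an assumption based on the Gemma format; in practice, `tokenizer.apply_chat_template` from Hugging Face `transformers` is the safer route):

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in Gemma-style turn markers.

    The markers are assumed from the Gemma base-model convention;
    prefer tokenizer.apply_chat_template in real use.
    """
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

# Example: a Slovenian instruction, as this model emphasizes Slovenian.
prompt = build_prompt("Napiši kratek povzetek o Triglavskem narodnem parku.")
print(prompt)
```

The resulting string can be tokenized and passed to the model for generation; the model's reply ends at the next `<end_of_turn>` marker.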

Good For

  • Slovenian Language Applications: Ideal for use cases requiring high-quality instruction following and text generation in Slovenian.
  • Bilingual English-Slovenian Tasks: Suitable for applications that involve switching between or understanding contexts in both English and Slovenian.
  • Research and Development: Provides a specialized model for researchers working on language models with a focus on less-resourced languages like Slovenian, supported by the PoVeJMo research program.