OpenLLM-Ro/RoGemma-7b-Instruct

TEXT GENERATIONConcurrency Cost:1Model Size:8.5BQuant:FP8Ctx Length:8kPublished:Oct 10, 2024License:cc-by-nc-4.0Architecture:Transformer Open Weights Cold

OpenLLM-Ro/RoGemma-7b-Instruct is an instruction-tuned generative text model developed by OpenLLM-Ro, specialized for the Romanian language. Fine-tuned from Google's Gemma-7b, this model is part of the first open-source effort to build large language models specifically for Romanian. It excels in Romanian language tasks, demonstrating strong performance across various academic and downstream benchmarks, including MT-Bench and RoCulturaBench. The model is primarily intended for research use in Romanian natural language processing and assistant-like chat applications.

Loading preview...

OpenLLM-Ro/RoGemma-7b-Instruct: Romanian Language LLM

OpenLLM-Ro/RoGemma-7b-Instruct is a generative text model developed by OpenLLM-Ro, specifically designed and optimized for the Romanian language. It is an instruction-tuned variant of Google's Gemma-7b, representing a significant open-source initiative to create specialized LLMs for Romanian.

Key Capabilities and Features

  • Romanian Language Specialization: This model is built from the ground up for Romanian, utilizing a collection of Romanian-specific instruction datasets like RoAlpaca, RoDolly, and RoOrca for fine-tuning.
  • Instruction-Tuned Performance: As an instruct model, it is designed for assistant-like chat and various natural language tasks in Romanian.
  • Strong Benchmark Results: The model demonstrates competitive performance on Romanian benchmarks, including an average score of 6.28 on MT-Bench and 3.65 on RoCulturaBench, often outperforming its base Gemma-1.1-7b-it model in Romanian contexts.
  • Research-Oriented Development: Developed as part of the OpenLLM-Ro project, it aims to foster research and development in Romanian NLP.

Intended Use Cases

  • Research: Ideal for academic and research purposes focused on Romanian natural language processing.
  • Assistant-like Chat: Suitable for building conversational agents and chatbots that interact in Romanian.
  • Natural Language Tasks: Can be adapted for a variety of Romanian NLP tasks, leveraging its specialized training.

Limitations

  • Language Specificity: Primarily intended for use in Romanian; performance in other languages is not guaranteed and is considered out-of-scope.
  • License Restrictions: Licensed under cc-by-nc-4.0, which may restrict commercial use.