Radu1999/Mistral-Instruct-Ukrainian-SFT

  • Task: Text generation
  • Concurrency cost: 1
  • Model size: 7B
  • Quantization: FP8
  • Context length: 8k
  • Published: Feb 9, 2024
  • License: apache-2.0
  • Architecture: Transformer
  • Open weights

Radu1999/Mistral-Instruct-Ukrainian-SFT is a 7-billion-parameter instruction-tuned language model developed by Radu Chivereanu, based on the Mistral-7B-v0.2 architecture. It is fine-tuned on Ukrainian datasets, including UA-SQUAD and Ukrainian StackExchange, to improve its performance in Ukrainian. It uses Grouped-Query Attention, Sliding-Window Attention, and a Byte-fallback BPE tokenizer. The model is optimized for instruction following, which makes it suitable for applications that need natural language understanding and generation in Ukrainian.

Overview

Radu1999/Mistral-Instruct-Ukrainian-SFT is an instruction-tuned language model built on the Mistral-7B-v0.2 architecture. Developed by Radu Chivereanu, it targets Ukrainian-language tasks through supervised fine-tuning (SFT).

Key Capabilities

  • Ukrainian Language Proficiency: Specifically fine-tuned on a diverse set of Ukrainian datasets, including UA-SQUAD, Ukrainian StackExchange, and a Ukrainian subset of the Belebele Dataset.
  • Instruction Following: Designed to follow instructions, with prompts wrapped in the [INST] and [/INST] tokens.
  • Mistral Architecture Benefits: Inherits architectural features from Mistral-7B-v0.2, such as Grouped-Query Attention, Sliding-Window Attention, and a Byte-fallback BPE tokenizer, contributing to efficient processing.
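
To make the [INST] and [/INST] prompt format concrete, here is a minimal Python sketch that wraps a single user instruction in those markers. The helper name `build_prompt` and the single-space padding around the instruction are assumptions for illustration. Tokenizers for Mistral-style models typically add the beginning-of-sequence token themselves, so it is left out here.

```python
def build_prompt(instruction: str) -> str:
    """Wrap one user instruction in the [INST] ... [/INST] markers.

    Hypothetical helper: the exact whitespace convention is an assumption.
    """
    return f"[INST] {instruction.strip()} [/INST]"


# Example: a Ukrainian question ("What is the capital of Ukraine?")
prompt = build_prompt("  Яка столиця України?  ")
print(prompt)
```

The model's reply is expected to follow the closing [/INST] marker, so the string returned here is what would be passed to the model as the full prompt.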

Performance

Scores reported on the Open LLM Leaderboard:

  • Average Score: 62.17 (computed over the full leaderboard suite, not only the three benchmarks listed below)
  • HellaSwag (10-Shot): 83.12
  • MMLU (5-Shot): 60.95
  • Winogrande (5-Shot): 77.51

Good For

  • Applications requiring instruction-based text generation and understanding in Ukrainian.
  • Researchers and developers working on Ukrainian NLP tasks who need a specialized model.

Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model set the following sampler parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
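
As a rough illustration of how these sampler parameters might be combined in one request, the sketch below builds a completion-style request body in Python. The body layout (`model`, `prompt`, `max_tokens`) is an assumption modeled on common OpenAI-style completion APIs, `build_request` is a hypothetical helper, and every numeric value is an illustrative placeholder, not one of the configurations used by Featherless users. Whether a given server accepts `top_k`, `min_p`, and `repetition_penalty` should be checked against its own documentation.

```python
import json

MODEL_ID = "Radu1999/Mistral-Instruct-Ukrainian-SFT"

# Placeholder values for illustration only; these are NOT the popular
# settings from the page, which lists the parameter names without values.
SAMPLER = {
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.1,
    "min_p": 0.05,
}


def build_request(prompt: str, sampler: dict, max_tokens: int = 256) -> dict:
    """Merge a prompt and sampler settings into one request body (assumed layout)."""
    return {"model": MODEL_ID, "prompt": prompt, "max_tokens": max_tokens, **sampler}


body = build_request("[INST] Яка столиця України? [/INST]", SAMPLER)
# ensure_ascii=False keeps the Ukrainian text readable in the printed JSON
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Keeping the sampler settings in a separate dictionary makes it easy to swap between several named configurations without touching the rest of the request.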