Radu1999/Mistral-Instruct-Ukrainian-slerp
TEXT GENERATION · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Feb 12, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

Radu1999/Mistral-Instruct-Ukrainian-slerp is a 7-billion-parameter instruction-tuned language model, merged from Mistral-7B-Instruct-v0.2 and Radu1999/Mistral-Instruct-Ukrainian-SFT-DPO using the slerp method. Built on the Mistral architecture, it is optimized for generating responses in Ukrainian and is designed for conversational AI and instruction-following tasks in that language, with a context length of 4096 tokens.


Overview

Radu1999/Mistral-Instruct-Ukrainian-slerp is a 7 billion parameter instruction-tuned language model. It was created by merging two distinct models: the general-purpose mistralai/Mistral-7B-Instruct-v0.2 and the Ukrainian-specific Radu1999/Mistral-Instruct-Ukrainian-SFT-DPO. The merge was performed using the slerp (spherical linear interpolation) method, which combines the strengths of both base models.

This model leverages the robust architecture of Mistral-7B-Instruct-v0.2 while integrating specialized fine-tuning for the Ukrainian language from the SFT-DPO model. The merging process applied distinct interpolation settings to the self-attention and MLP layers to balance the contributions of the two parents.
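The slerp operation used in the merge can be illustrated with a minimal NumPy sketch. This is not the model's actual merge pipeline (which operated per-layer, with its own interpolation weights for attention and MLP tensors); it only shows the core spherical linear interpolation formula applied to two flattened weight vectors, with a linear-interpolation fallback when the vectors are nearly parallel.

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate t follows the arc
    between the two directions rather than the straight chord.
    """
    # Angle between the two weight directions
    v0n = v0 / (np.linalg.norm(v0) + eps)
    v1n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    theta = np.arccos(dot)

    # Nearly parallel vectors: slerp degenerates, fall back to plain lerp
    if theta < eps:
        return (1.0 - t) * v0 + t * v1

    s = np.sin(theta)
    return (np.sin((1.0 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1
```

Compared with plain averaging, slerp preserves the geometry of the interpolation path, which is one reason it is a popular choice for merging fine-tuned checkpoints of the same base model.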

Key Capabilities

  • Ukrainian Language Proficiency: Enhanced performance for instruction-following and text generation in Ukrainian.
  • Instruction Tuning: Capable of understanding and executing user instructions effectively.
  • Mistral Architecture: Benefits from the efficient and powerful Mistral 7B base model.

Good For

  • Ukrainian Chatbots: Developing conversational AI agents that interact in Ukrainian.
  • Ukrainian Content Generation: Creating text, summaries, or responses in the Ukrainian language.
  • Research and Development: Exploring merged model performance for specific language tasks.
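A typical way to try the model is through the Hugging Face transformers library. The sketch below is an assumption, not an official usage snippet from the model card: it wraps a single user turn in the standard Mistral-Instruct `[INST] ... [/INST]` template and runs greedy-ish sampling with illustrative generation settings. Running `generate_reply` downloads the full 7B checkpoint.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a single user turn in the Mistral-Instruct chat template."""
    return f"<s>[INST] {user_message} [/INST]"

def generate_reply(user_message: str, max_new_tokens: int = 256) -> str:
    """Generate a Ukrainian reply. Heavyweight: fetches ~14 GB of weights on first use."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Radu1999/Mistral-Instruct-Ukrainian-slerp"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(build_prompt(user_message), return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,       # sampling settings here are illustrative, not tuned
        temperature=0.7,
    )
    # Decode only the newly generated tokens, skipping the prompt
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )

if __name__ == "__main__":
    print(generate_reply("Розкажи коротко про Київ."))
```

For multi-turn chat, the tokenizer's `apply_chat_template` method is the more robust route, since it reproduces the exact template shipped with the checkpoint rather than a hand-built prompt string.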