SherlockAssistant/Mistral-7B-Instruct-Ukrainian

Text Generation · Open Weights · Concurrency Cost: 1 · Model Size: 7B · Quantization: FP8 · Context Length: 8k · Published: Feb 26, 2024 · License: apache-2.0 · Architecture: Transformer

SherlockAssistant/Mistral-7B-Instruct-Ukrainian is a 7-billion-parameter instruction-tuned large language model, based on Mistral-7B-v0.2 and fine-tuned specifically for the Ukrainian language. It uses Grouped-Query Attention and Sliding-Window Attention with an 8192-token context length. The model is optimized for Ukrainian natural language processing tasks, including question answering and general instruction following, via a multi-stage training process over structured and unstructured Ukrainian datasets followed by Direct Preference Optimization (DPO).


Overview

SherlockAssistant/Mistral-7B-Instruct-Ukrainian is built on the Mistral-7B-v0.2 architecture and optimized for Ukrainian through a multi-stage fine-tuning process: initial fine-tuning on structured and unstructured Ukrainian datasets, followed by an SLERP merge with the CultriX/NeuralTrix-7B-v1 model, and concluding with a DPO stage.

Key Capabilities

  • Ukrainian Language Proficiency: Specialized in understanding and generating text in Ukrainian, making it suitable for localized applications.
  • Instruction Following: Designed to respond accurately to instructions, leveraging its instruction fine-tuning.
  • Question Answering: Trained on datasets like UA-SQUAD and Ukrainian StackExchange, enhancing its ability to answer questions.
  • Context Handling: Features an 8192-token context window for processing longer inputs.

Training Details

The model's training incorporated diverse Ukrainian datasets:

  • Structured Datasets: Includes UA-SQUAD, Ukrainian StackExchange, UAlpaca Dataset, Ukrainian subsets from Belebele and XQA, and the ZNO Dataset.
  • Unstructured Datasets: Utilized Ukrainian Wikipedia for broad language understanding.
  • DPO: Applied to a Ukrainian translation of the distilabel-intel-orca-dpo-pairs dataset to align outputs with human preferences.
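
For context, DPO trains on preference pairs, where each record couples a prompt with a preferred and a dispreferred response. Below is a minimal sketch of one such record; the prompt/chosen/rejected field names follow the common convention for DPO datasets and are assumptions here, not confirmed fields of the translated dataset:

```python
# Illustrative shape of a DPO preference record. Field names
# (prompt/chosen/rejected) follow the common convention and are assumed.
dpo_record = {
    # "Explain what photosynthesis is."
    "prompt": "Поясніть, що таке фотосинтез.",
    # Preferred (chosen) response: informative and on-topic.
    "chosen": "Фотосинтез є процесом, у якому рослини перетворюють світлову енергію на хімічну.",
    # Dispreferred (rejected) response: unhelpful.
    "rejected": "Не знаю.",
}
```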

Usage

Prompts should be wrapped in [INST] and [/INST] tokens to match the model's instruction fine-tuning format. The tokenizer's apply_chat_template() method applies this formatting automatically.
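
A minimal usage sketch with the Hugging Face transformers library is shown below; the dtype and sampling parameters are illustrative defaults, not tuned recommendations from the model authors:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SherlockAssistant/Mistral-7B-Instruct-Ukrainian"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # illustrative; choose a dtype that fits your hardware
    device_map="auto",          # requires the accelerate package
)

# apply_chat_template() wraps the user turn in [INST] ... [/INST] automatically.
messages = [{"role": "user", "content": "Яка столиця України?"}]  # "What is the capital of Ukraine?"
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

output = model.generate(input_ids, max_new_tokens=128, do_sample=True, temperature=0.7)
# Decode only the tokens generated after the prompt, so the reply is not
# prefixed with an echo of the instruction.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```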