Model Overview
NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2 is a 7-billion-parameter language model based on the Mistral architecture, fine-tuned from the Open-Orca/Mistral-7B-OpenOrca base model. This iteration specifically leverages the OpenAssistant/oasst_top1_2023-08-25 dataset, which includes multilingual data across 20 languages, enhancing its conversational and instruction-following capabilities.
Key Features & Enhancements
- Base Model: Fine-tuned from Open-Orca/Mistral-7B-OpenOrca.
- Training Data: Uses the OpenAssistant/oasst_top1_2023-08-25 dataset for instruction tuning, covering a broad range of languages (e.g., English, Spanish, German, French, Russian).
- Attention Sinks: Incorporates the attention_sinks technique, which can improve generation efficiency and context handling, particularly for longer sequences. This is configured with an attention_sink_size of 4 and an attention_sink_window_size of 1024.
- Multilingual Support: Benefits from the diverse language coverage of the OASST dataset, making it suitable for multilingual applications.
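As a minimal sketch, the sink configuration above could be applied at load time via the `attention_sinks` library, which provides drop-in replacements for the Transformers auto classes; the `attention_sink_size` / `attention_sink_window_size` keyword arguments here assume that library's API, so verify against its documentation before use.

```python
# Sketch: loading the model with attention sinks via the `attention_sinks`
# library (drop-in wrappers around the Hugging Face auto classes).
# The kwargs below assume that library's API.

def kv_cache_budget(sink_size: int = 4, window_size: int = 1024) -> int:
    """Total key/value positions retained per layer: the always-kept
    "sink" tokens plus the sliding window of recent tokens."""
    return sink_size + window_size

if __name__ == "__main__":
    from attention_sinks import AutoModelForCausalLM  # assumed import path
    from transformers import AutoTokenizer

    model_id = "NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        attention_sink_size=4,            # first 4 tokens kept as "sinks"
        attention_sink_window_size=1024,  # sliding window over recent tokens
        device_map="auto",
    )
    print(f"KV cache keeps at most {kv_cache_budget()} positions per layer.")
```

With this setup, the KV cache is bounded at 4 + 1024 = 1028 positions per layer regardless of how long generation runs, which is what makes the technique attractive for extended dialogues.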
Usage Considerations
This model is designed for tasks requiring instruction following and conversational generation. The integration of attention sinks suggests potential benefits for maintaining coherence over extended dialogues or complex instructions. Example usage demonstrates its ability to generate code snippets and respond to queries in multiple languages.
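A generation call might be sketched as below; the ChatML prompt template is an assumption carried over from the Open-Orca/Mistral-7B-OpenOrca base model rather than something stated here, so check this checkpoint's tokenizer configuration before relying on it.

```python
# Sketch of prompting the model. The ChatML template below is assumed
# from the Open-Orca/Mistral-7B-OpenOrca base model, not verified for
# this specific checkpoint.

def build_chatml_prompt(user_message: str,
                        system_message: str = "You are a helpful assistant.") -> str:
    """Wrap a user message in the ChatML format (assumed from the base model)."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

if __name__ == "__main__":
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # Multilingual query (Spanish): "Write a Python function that reverses a string."
    prompt = build_chatml_prompt(
        "Escribe una función en Python que invierta una cadena."
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens, dropping the prompt.
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                           skip_special_tokens=True))
```

The prompt ends with an open `<|im_start|>assistant` turn so that generation continues as the assistant's reply.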