NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v3

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4k · Published: Oct 13, 2023 · License: apache-2.0 · Architecture: Transformer

NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v3 is a 7-billion-parameter causal language model fine-tuned from Open-Orca/Mistral-7B-OpenOrca. It was instruction-tuned on the OpenAssistant/oasst_top1_2023-08-25 dataset and incorporates an attention-sink mechanism, which may improve long-context handling. The model is designed for general-purpose conversational AI and instruction-following tasks and supports a context length of 4096 tokens.


Model Overview

NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v3 is a 7 billion parameter instruction-tuned language model built upon the Open-Orca/Mistral-7B-OpenOrca base model. It has been further fine-tuned using the OpenAssistant/oasst_top1_2023-08-25 dataset, which includes multilingual conversational data across 20 languages.

Key Features & Training Details

  • Base Model: Fine-tuned from Open-Orca/Mistral-7B-OpenOrca.
  • Instruction Tuning: Utilizes the OpenAssistant/oasst_top1_2023-08-25 dataset for enhanced instruction-following capabilities.
  • Attention Sinks: Incorporates the attention_sinks technique, which retains a small number of initial "sink" tokens in the KV cache alongside a sliding window of recent tokens, potentially improving stability and efficiency on longer contexts. This behavior can be configured with parameters such as attention_sink_size and attention_sink_window_size during inference.
  • Flash Attention: Training and inference can optionally leverage flash-attention for speed optimizations, particularly on supported GPU hardware.
  • Context Length: Supports a maximum context length of 4096 tokens.
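The attention-sink configuration above can be sketched with the `attention_sinks` library, which acts as a drop-in replacement for the transformers Auto classes. The sink and window sizes below are illustrative defaults, not values taken from this model card; verify them against your deployment before use.

```python
# Sketch: loading the model with attention sinks enabled via the
# `attention_sinks` drop-in wrapper around transformers.
# Sink/window sizes here are assumptions for illustration only.

MODEL_ID = "NickyNicky/Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v3"


def sink_config(sink_size: int = 4, window_size: int = 1020) -> dict:
    """Extra kwargs consumed by the attention_sinks AutoModel wrapper."""
    return {
        "attention_sink_size": sink_size,            # initial tokens kept permanently
        "attention_sink_window_size": window_size,   # sliding window of recent tokens
    }


def load_model(model_id: str = MODEL_ID):
    # Heavy imports are done lazily so the config helper above stays usable
    # without a GPU or the model weights present.
    import torch
    from attention_sinks import AutoModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,
        device_map="auto",
        **sink_config(),
    )
    return tokenizer, model
```

A larger `attention_sink_window_size` keeps more recent context in the cache at the cost of memory; the handful of sink tokens is what keeps generation stable once the window starts evicting old tokens.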

Intended Use Cases

This model is suitable for a variety of conversational AI and instruction-following applications, including:

  • General Chatbots: Engaging in open-ended conversations.
  • Question Answering: Providing informative responses to user queries.
  • Content Generation: Assisting with text creation based on prompts.
  • Multilingual Interaction: Benefiting from the diverse language coverage of its training dataset.
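A minimal chat-style inference sketch for the use cases above is shown below. The ChatML-style tags (`<|im_start|>` / `<|im_end|>`) are an assumption based on the Mistral-7B-OpenOrca lineage; confirm the actual format with the tokenizer's chat template before relying on it.

```python
# Sketch: formatting a conversation and generating a reply.
# The ChatML prompt format is assumed, not confirmed by the model card.

def build_chatml_prompt(messages: list[dict]) -> str:
    """Render [{'role': ..., 'content': ...}] into a ChatML prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Leave the assistant turn open so the model continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


def generate_reply(tokenizer, model, messages: list[dict],
                   max_new_tokens: int = 256) -> str:
    import torch

    prompt = build_chatml_prompt(messages)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=0.7,
            top_p=0.9,
        )
    # Decode only the newly generated tokens, not the prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

If the tokenizer ships a chat template, `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` is the safer way to build the prompt than hand-rolling the tags.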