filipealmeida/Mistral-7B-Instruct-v0.1-sharded

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 8k | Published: Sep 28, 2023 | License: apache-2.0 | Architecture: Transformer | Open Weights

The filipealmeida/Mistral-7B-Instruct-v0.1-sharded model is an instruction-tuned language model with roughly 7 billion parameters (listed as 8B in the catalog metadata), with its checkpoint sharded so that it can be loaded with limited CPU memory. The underlying model was developed by Mistral AI and is based on the Mistral-7B-v0.1 architecture, which features Grouped-Query Attention and Sliding-Window Attention. It is tuned to follow instructions and generate conversational text, which makes it suitable for dialogue-based applications.


Overview

This model is a sharded version of the Mistral-7B-Instruct-v0.1 Large Language Model, designed to be usable even with limited CPU memory. It is an instruction-tuned variant of the original Mistral-7B-v0.1 generative text model, fine-tuned using a diverse set of publicly available conversation datasets.
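As a minimal sketch of what loading the sharded checkpoint might look like, the helper below uses the Hugging Face `transformers` library. It assumes `transformers`, `torch`, and `accelerate` are installed and follows the argument names of `transformers` 4.x. The helper name `load_sharded_model` is hypothetical, and half precision is an assumption made here to reduce memory use, not something the model card prescribes.

```python
def load_sharded_model(repo_id="filipealmeida/Mistral-7B-Instruct-v0.1-sharded"):
    """Load the tokenizer and model from a sharded checkpoint (hypothetical helper)."""
    # Imported lazily so that defining this helper does not require the libraries.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.float16,  # assumption: half precision to shrink the weights in memory
        low_cpu_mem_usage=True,     # avoid holding a second full copy of the weights in RAM
    )
    return tokenizer, model
```

Calling `load_sharded_model()` downloads the full set of weights, so it is best run once on the target machine. With a sharded checkpoint, `from_pretrained` reads the shard files one after another, which keeps peak CPU memory closer to the model size plus one shard.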

Key Capabilities

  • Instruction Following: Optimized to understand and execute instructions provided within [INST] and [/INST] tags.
  • Conversational AI: Excels at generating coherent and contextually relevant responses in dialogue scenarios.
  • Efficient Deployment: The checkpoint is split into smaller shards, so the model can be loaded in environments with limited CPU memory.
  • Advanced Architecture: Uses Grouped-Query Attention for faster inference and Sliding-Window Attention to handle longer sequences at lower cost.
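
To illustrate the Sliding-Window Attention mentioned in the list above, here is a toy sketch of a causal attention mask restricted to a fixed window. The function name `sliding_window_causal_mask` and the window of 3 are illustrative assumptions. The exact boundary convention (whether the window counts the current token) varies between implementations, so this is not a description of the model's own code.

```python
def sliding_window_causal_mask(seq_len, window):
    """Return mask[i][j] == True when query position i may attend to key position j."""
    return [
        # Causal (j <= i) and within the last `window` positions, current token included.
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]

mask = sliding_window_causal_mask(seq_len=5, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
# x....
# xx...
# xxx..
# .xxx.
# ..xxx
```

Each row attends to at most `window` positions, so the attention cost per token stays bounded as the sequence grows instead of scaling with the full sequence length.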

Instruction Format

To use the model's instruction-following capabilities, wrap each instruction in [INST] and [/INST] tokens. Only the very first instruction is preceded by the beginning-of-sentence (BOS) token ID; subsequent instructions are not. The model's generation ends with the end-of-sentence (EOS) token ID.
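
The format described above can be sketched as a small prompt builder. The helper name `build_prompt`, the string forms `<s>` and `</s>` for the BOS and EOS tokens, and the exact whitespace between turns are assumptions made for illustration. The tokenizer's own chat template, if available, is the more reliable source.

```python
BOS, EOS = "<s>", "</s>"  # assumption: string forms of the BOS and EOS tokens

def build_prompt(turns):
    """Build a prompt from (user_message, assistant_reply_or_None) pairs, oldest first."""
    prompt = BOS  # only the very first instruction is preceded by BOS
    for user, assistant in turns:
        prompt += f"[INST] {user} [/INST]"
        if assistant is not None:
            # A completed assistant turn is closed with EOS.
            prompt += f" {assistant}{EOS} "
    return prompt

prompt = build_prompt([("Hi", "Hello!"), ("Tell me a joke", None)])
print(prompt)
# <s>[INST] Hi [/INST] Hello!</s> [INST] Tell me a joke [/INST]
```

When passing such a string to a tokenizer, note that many tokenizers prepend BOS on their own. In that case either drop `BOS` from the string or disable automatic special tokens, so the sequence does not start with two BOS tokens.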

Good For

  • Building chatbots and conversational agents.
  • Instruction-based text generation tasks.
  • Applications that need to load a 7B-class language model on machines with limited CPU memory.