Overview

Mistral-7B-Instruct-v0.2 is an instruction-tuned variant of the Mistral-7B-v0.2 large language model (LLM) developed by Mistral AI. The v0.2 base model differs from its predecessor, Mistral-7B-v0.1, in three respects, listed below.

Key Enhancements

Expanded Context Window: Supports a 32k-token context window, up from 8k in v0.1, so the model can process longer inputs and retain more conversational history.
RoPE Theta Adjustment: Sets rope_theta, the base of the rotary position embedding (RoPE) frequencies, to 1e6. A larger base lengthens the rotation wavelengths, which helps keep token positions distinguishable across a longer context.
Sliding-Window Attention Removal: Unlike v0.1, this version does not use sliding-window attention.
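
To make the effect of the RoPE theta value more concrete, the sketch below computes the standard RoPE inverse frequencies and the wavelength of the slowest-rotating dimension pair for two bases. It is an illustration only: the head dimension of 128 and the comparison base of 1e4 (a common RoPE default) are assumptions not stated in this section, and the helper names are hypothetical.

```python
import math


def rope_inv_freq(head_dim: int, theta: float) -> list[float]:
    """Inverse frequencies theta^(-2i/d) for i = 0 .. d/2 - 1 (standard RoPE)."""
    return [theta ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]


def longest_wavelength(head_dim: int, theta: float) -> float:
    """Wavelength, in token positions, of the slowest-rotating dimension pair."""
    return 2 * math.pi / rope_inv_freq(head_dim, theta)[-1]


HEAD_DIM = 128  # assumption: per-head dimension, not given in this section

# 1e4 is an assumed comparison base; 1e6 is the value quoted above.
for theta in (1e4, 1e6):
    wavelength = longest_wavelength(HEAD_DIM, theta)
    print(f"theta={theta:.0e}: longest wavelength ~ {wavelength:,.0f} positions")
```

With the larger base, the slowest dimension pair takes far more positions to complete a rotation, which is one intuition for why a higher theta pairs naturally with a longer context window.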

Instruction Following

The model is fine-tuned to follow instructions. Each instruction in a prompt should be wrapped in [INST] and [/INST] tokens, and only the first instruction should be preceded by a begin-of-sentence ID. The tokenizer's apply_chat_template() method in Hugging Face Transformers applies this format automatically.
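
As a rough illustration of the format described above, the following sketch assembles a multi-turn prompt by hand. The function name build_prompt is hypothetical, the special-token strings `<s>` and `</s>` are assumed, and the exact whitespace around the markers is an assumption; the tokenizer's own chat template remains the authoritative source.

```python
BOS, EOS = "<s>", "</s>"  # assumed special-token strings


def build_prompt(messages: list[dict]) -> str:
    """Format alternating user/assistant turns in the [INST] ... [/INST] style."""
    parts = [BOS]  # the begin-of-sentence marker appears once, before the first instruction
    for i, msg in enumerate(messages):
        expected = "user" if i % 2 == 0 else "assistant"
        if msg["role"] != expected:
            raise ValueError("roles must alternate user/assistant, starting with user")
        if msg["role"] == "user":
            parts.append(f"[INST] {msg['content']} [/INST]")
        else:
            # assistant turns are closed with the end-of-sentence marker
            parts.append(f"{msg['content']}{EOS}")
    return "".join(parts)


messages = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Probably a squeeze of fresh lemon juice."},
    {"role": "user", "content": "Do you have mayonnaise recipes?"},
]
print(build_prompt(messages))
```

In practice, calling tokenizer.apply_chat_template(messages, tokenize=False) is preferable to a hand-rolled formatter, since it uses the template shipped with the model. If a manually built string like this one is tokenized afterwards, take care that the tokenizer does not add a second begin-of-sentence token.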

Limitations

As an instruction fine-tuned model, Mistral-7B-Instruct-v0.2 serves as a demonstration of the base model's capabilities. It has no built-in moderation mechanisms, so deployments that require controlled outputs need external guardrails.