Overview
Mistral-7B-Instruct-v0.2 is an instruction-tuned variant of the Mistral-7B-v0.2 Large Language Model (LLM) developed by Mistral AI. The v0.2 base model differs from its predecessor, Mistral-7B-v0.1, in three respects.
Key Enhancements
- Expanded Context Window: Supports a 32k-token context window, up from 8k in v0.1, so the model can process longer inputs and retain more conversational history.
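- RoPE Theta Adjustment: Sets rope-theta, the base of the rotary position embeddings (RoPE), to 1e6. A larger base makes the positional rotation vary more slowly across positions, which changes how the model encodes the distance between tokens.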
- Sliding-Window Attention Removal: Unlike v0.1, this version does not employ Sliding-Window Attention.
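To make the effect of the rope-theta value concrete, the following is a minimal sketch of the standard RoPE frequency formula, in which dimension pair i rotates at the rate theta^(-2i/d). It is not Mistral AI's implementation. The head dimension of 128 and the comparison base of 1e4 are assumptions made for illustration; neither is stated in the text above.

```python
import math


def rope_inv_freq(head_dim: int, theta: float) -> list[float]:
    """Return the RoPE inverse frequencies theta^(-2i/d) for i = 0 .. d/2 - 1."""
    return [theta ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]


def longest_wavelength(head_dim: int, theta: float) -> float:
    """Number of positions the slowest-rotating pair needs for one full turn."""
    return 2 * math.pi / min(rope_inv_freq(head_dim, theta))


HEAD_DIM = 128  # assumption: per-head dimension, not given in the text above

# 1e4 is an assumed comparison baseline; 1e6 is the value named above.
for theta in (1e4, 1e6):
    wavelength = longest_wavelength(HEAD_DIM, theta)
    print(f"theta={theta:.0e}: longest wavelength ~ {wavelength:,.0f} positions")
```

Under these assumptions, the larger base stretches the slowest rotation over many more positions, so distant tokens remain distinguishable across a longer context.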
Instruction Following
The model is fine-tuned to follow instructions. Each instruction in a prompt should be wrapped in [INST] and [/INST] markers, and only the first instruction should be preceded by the begin-of-sentence (BOS) token; later instructions should not. Hugging Face's apply_chat_template() method produces this format automatically.
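The instruction format described above can be sketched in plain Python. The function build_prompt is a hypothetical helper written for illustration, not part of any library. The literal strings "<s>" and "</s>" for the begin- and end-of-sentence tokens, and the exact whitespace around the markers, are assumptions; the tokenizer's own chat template remains the authoritative source.

```python
BOS = "<s>"   # assumed text form of the begin-of-sentence token
EOS = "</s>"  # assumed text form of the end-of-sentence token


def build_prompt(messages: list[dict]) -> str:
    """Format alternating user/assistant turns in the [INST] ... [/INST] style.

    The BOS token appears once, before the first instruction only.
    """
    if not messages:
        raise ValueError("at least one message is required")
    parts = [BOS]
    for index, message in enumerate(messages):
        expected_role = "user" if index % 2 == 0 else "assistant"
        if message["role"] != expected_role:
            raise ValueError("roles must alternate user/assistant, starting with user")
        if message["role"] == "user":
            parts.append(f"[INST] {message['content']} [/INST]")
        else:
            # A completed assistant turn is closed with the EOS token.
            parts.append(f"{message['content']}{EOS}")
    return "".join(parts)


conversation = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Fresh lemon juice."},
    {"role": "user", "content": "Do you have mayonnaise recipes?"},
]
print(build_prompt(conversation))
```

In practice, passing the same list of messages to the tokenizer's apply_chat_template() method avoids hand-building the string and keeps the prompt consistent with the template shipped alongside the model.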
Limitations
Mistral-7B-Instruct-v0.2 is intended as a demonstration that the base model can be fine-tuned to follow instructions. It has no built-in moderation mechanisms, so deployments that require controlled outputs need external guardrails.
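One way to picture an external guardrail is a thin wrapper that checks both the prompt and the model output before anything is returned. The sketch below is illustrative only: guarded_generate, fake_generate, and the keyword check are hypothetical stand-ins, and a real deployment would substitute an actual model call and a proper moderation classifier or policy service.

```python
from typing import Callable

REFUSAL = "Sorry, I can't help with that."


def guarded_generate(
    prompt: str,
    generate_fn: Callable[[str], str],
    is_allowed: Callable[[str], bool],
    refusal: str = REFUSAL,
) -> str:
    """Run generate_fn only on allowed prompts, and screen its output as well."""
    if not is_allowed(prompt):
        return refusal
    output = generate_fn(prompt)
    return output if is_allowed(output) else refusal


# Hypothetical stand-ins so the sketch runs without a model.
BLOCKED_TERMS = {"forbidden"}


def keyword_check(text: str) -> bool:
    """Toy policy: reject any text containing a blocked term."""
    return not any(term in text.lower() for term in BLOCKED_TERMS)


def fake_generate(prompt: str) -> str:
    """Placeholder for a real model call; simply echoes the prompt."""
    return f"Echo: {prompt}"


print(guarded_generate("Tell me a joke", fake_generate, keyword_check))
print(guarded_generate("Something forbidden", fake_generate, keyword_check))
```

Keeping the check outside the model means the policy can be changed or replaced without touching the model itself.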