Name: dimodimodimo/Mistral-7B-Instruct-v0.2 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: dimodimodimo

Overview

This model, dimodimodimo/Mistral-7B-Instruct-v0.2, is an instruction-tuned variant of the Mistral-7B-v0.2 base model developed by Mistral AI. It keeps the Mistral-7B architecture and changes a few settings relative to v0.1 to improve instruction following, particularly on long inputs.

Key Architectural Changes (vs. Mistral-7B-v0.1)

Expanded Context Window: Supports a 32,000-token context window, up from 8,000 tokens in v0.1, so the model can process much longer inputs and generate longer outputs.
RoPE Theta Update: Uses a rope-theta value of 1e6. Rope-theta is the base of the rotary position embedding (RoPE) frequencies; a larger base makes the positional rotations vary more slowly, which helps the model handle longer sequences.
No Sliding-Window Attention: This version does not use sliding-window attention, so each token can attend to the entire preceding context rather than a fixed-size window.
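
To make the rope-theta change concrete, here is a minimal sketch that computes the wavelength of each rotary frequency pair under the standard RoPE formula. The function name rope_wavelengths is hypothetical, and two values are assumptions not stated in this listing: a head dimension of 128 and a v0.1 base of 1e4.

```python
import math
from typing import List


def rope_wavelengths(theta: float, head_dim: int = 128) -> List[float]:
    """Wavelength, in tokens, of each rotary frequency pair for a given base.

    head_dim=128 is an assumed value for illustration.
    """
    # Standard RoPE: pair i rotates at frequency theta ** (-2i / head_dim),
    # so its wavelength is 2 * pi * theta ** (2i / head_dim).
    return [2 * math.pi * theta ** (2 * i / head_dim) for i in range(head_dim // 2)]


old = rope_wavelengths(1e4)  # assumed v0.1 base, not stated in this listing
new = rope_wavelengths(1e6)  # base quoted above for v0.2

print(f"longest wavelength, theta=1e4: {max(old):,.0f} tokens")
print(f"longest wavelength, theta=1e6: {max(new):,.0f} tokens")
```

With the larger base, the slowest-rotating pairs span far more tokens before repeating, which is one reason a larger rope-theta tends to accompany a longer context window.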

Instruction Format

To take advantage of the instruction fine-tuning, wrap each prompt in [INST] and [/INST] tokens. Only the very first instruction should be preceded by the beginning-of-sentence (BOS) token; later instructions should not. Hugging Face Transformers applies this format through the tokenizer's apply_chat_template() method.
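
The format can be sketched in plain Python, without loading a tokenizer. The helper name build_prompt is hypothetical, and the literal token strings and exact whitespace are assumptions; in practice apply_chat_template() should be treated as the source of truth.

```python
from typing import Dict, List

# Assumed literal forms of the BOS/EOS tokens; the real tokenizer inserts
# these as special token IDs rather than plain text.
BOS, EOS = "<s>", "</s>"


def build_prompt(messages: List[Dict[str, str]]) -> str:
    """Render alternating user/assistant turns as [INST] ... [/INST] text."""
    prompt = BOS  # BOS appears once, before the very first instruction
    for i, msg in enumerate(messages):
        expected = "user" if i % 2 == 0 else "assistant"
        if msg["role"] != expected:
            raise ValueError("roles must alternate user/assistant, starting with user")
        if msg["role"] == "user":
            prompt += f"[INST] {msg['content']} [/INST]"
        else:
            # Each assistant reply is closed with the end-of-sentence token
            prompt += f" {msg['content']}{EOS}"
    return prompt


chat = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "Fresh lemon juice."},
    {"role": "user", "content": "Do you have mayonnaise recipes?"},
]
print(build_prompt(chat))
```

With Transformers installed, passing the same list of role/content dictionaries to tokenizer.apply_chat_template(chat, tokenize=False) should yield the equivalent string without hand-built formatting.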

Limitations

The Mistral 7B Instruct model is presented as a demonstration that the base model can be fine-tuned easily to good effect. It has no built-in moderation mechanisms, so deployments that require content moderation need to add external guardrails.