alquimista888/mixtral_quantized
The alquimista888/mixtral_quantized model is an instruct fine-tuned version of Mistral-7B-v0.2, a Large Language Model developed by Mistral AI. This 7-billion-parameter model has a 32k context window and is tuned for instruction-following tasks. It targets developers who need an efficient instruction-following model for natural language processing applications.
Overview
alquimista888/mixtral_quantized is an instruct fine-tuned variant of Mistral-7B-v0.2, a Large Language Model developed by Mistral AI. The v0.2 base differs from its predecessor, Mistral-7B-v0.1, in the three ways listed in the next section.
Key Enhancements from Mistral-7B-v0.1 to v0.2
- Expanded Context Window: The model now supports a 32k context window, a substantial increase from the 8k context in v0.1, allowing for processing longer inputs and maintaining more conversational history.
- Rope-theta Adjustment: The base frequency of the rotary position embeddings (rope-theta) is set to 1e6.
- Sliding-Window Attention Removal: The v0.2 base model no longer uses Sliding-Window Attention.
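To give a feel for what the rope-theta value changes, the following is a minimal sketch in plain Python of the standard rotary-embedding frequency formula, not code from the model itself. It assumes a head dimension of 128, a v0.1 rope-theta of 1e4, and a 32k window of 32768 tokens; none of these three values is stated on this page, so treat them as illustrative.

```python
import math


def rope_wavelengths(theta: float, head_dim: int = 128) -> list[float]:
    """Return the wavelength, in token positions, of each rotary frequency pair.

    Uses the standard RoPE formula inv_freq_i = theta ** (-2i / head_dim).
    head_dim=128 is an assumption, not a value taken from this page.
    """
    wavelengths = []
    for i in range(head_dim // 2):
        inv_freq = theta ** (-2 * i / head_dim)
        wavelengths.append(2 * math.pi / inv_freq)
    return wavelengths


V01_THETA = 1e4          # assumption: rope-theta used by the v0.1 base
V02_THETA = 1e6          # value stated in the list above
CONTEXT_TOKENS = 32768   # assumption: "32k" read as 32 * 1024 tokens

for label, theta in (("v0.1", V01_THETA), ("v0.2", V02_THETA)):
    waves = rope_wavelengths(theta)
    # Count frequency pairs that do not complete a full rotation within the window.
    beyond_window = sum(1 for w in waves if w > CONTEXT_TOKENS)
    print(f"{label}: longest wavelength ~ {max(waves):,.0f} tokens, "
          f"{beyond_window} of {len(waves)} pairs exceed the window")
```

Under these assumptions, a larger rope-theta stretches the low-frequency end of the spectrum, so more frequency pairs rotate less than one full turn across a 32k-token input.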
Instruction Format
To make effective use of the instruction fine-tuning, enclose each prompt in [INST] and [/INST] tokens. The first instruction must begin with a begin-of-sentence token ID; subsequent instructions must not. The model's generation ends with an end-of-sentence token ID. The apply_chat_template() method in the Hugging Face Transformers library produces this format.
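The format described above can be sketched as a small string builder. This is an illustration only: the helper name build_prompt is hypothetical, the literal strings "<s>" and "</s>" are assumed to be the text forms of the begin- and end-of-sentence tokens, and the exact whitespace around [INST] and [/INST] is an assumption. The tokenizer's own chat template remains the authoritative source.

```python
BOS = "<s>"    # assumed text form of the begin-of-sentence token
EOS = "</s>"   # assumed text form of the end-of-sentence token


def build_prompt(messages: list[dict[str, str]]) -> str:
    """Build an [INST]-style prompt from alternating user/assistant messages.

    Hypothetical helper for illustration; whitespace handling is an assumption.
    """
    prompt = BOS  # emitted once, before the first instruction only
    for index, message in enumerate(messages):
        expected_role = "user" if index % 2 == 0 else "assistant"
        if message["role"] != expected_role:
            raise ValueError("roles must alternate user/assistant, starting with user")
        if message["role"] == "user":
            prompt += f"[INST] {message['content']} [/INST]"
        else:
            # Each completed assistant turn is closed by the end-of-sentence token.
            prompt += f"{message['content']}{EOS}"
    return prompt


conversation = [
    {"role": "user", "content": "What is your favourite condiment?"},
    {"role": "assistant", "content": "I am partial to fresh lemon juice."},
    {"role": "user", "content": "Do you have mayonnaise recipes?"},
]
print(build_prompt(conversation))
```

In practice, calling tokenizer.apply_chat_template(conversation, tokenize=False) on the model's tokenizer is the safer route, provided the repository ships a chat template, because it uses the exact token strings and spacing the model was trained with.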
Limitations
As an instruct fine-tuned model, it demonstrates compelling performance, but it has no built-in moderation mechanisms. The developers are seeking community engagement on ways to add guardrails, so that the model can be deployed in environments that require moderated outputs.