itstechuse/akeno-mergedv8
itstechuse/akeno-mergedv8 is a 7-billion-parameter instruct fine-tuned large language model based on the Mistral-7B-v0.2 architecture developed by Mistral AI. It features an expanded 32k-token context window and a modified Rope-theta for improved long-context performance. The model is optimized for following instructions and generating coherent text, making it suitable for a wide range of conversational AI and text-generation tasks.
Model Overview
The itstechuse/akeno-mergedv8 model is an instruct fine-tuned variant of the Mistral-7B-v0.2 Large Language Model, originally developed by Mistral AI. This 7 billion parameter model builds upon its predecessor with significant architectural improvements.
Key Enhancements
- Expanded Context Window: Features a 32k token context window, a substantial increase from the 8k context in Mistral-7B-v0.1, allowing for processing longer inputs and generating more extensive outputs.
- Rope-theta Adjustment: Incorporates a Rope-theta = 1e6 modification.
- No Sliding-Window Attention: Unlike previous versions, this model does not utilize Sliding-Window Attention (both settings can be confirmed from the model configuration, as shown below).
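As a minimal sketch, assuming the repository publishes a standard Mistral-style config.json, these architectural parameters can be checked directly with Hugging Face Transformers:

```python
from transformers import AutoConfig

# Inspect the published configuration (assumes a standard Mistral-style
# config.json exists in the itstechuse/akeno-mergedv8 repository).
config = AutoConfig.from_pretrained("itstechuse/akeno-mergedv8")

print(config.max_position_embeddings)  # expected: 32768 (32k context window)
print(config.rope_theta)               # expected: 1000000.0 (Rope-theta = 1e6)
print(config.sliding_window)           # expected: None (no Sliding-Window Attention)
```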
Instruction Format
To leverage its instruction-tuned capabilities, prompts should be enclosed within [INST] and [/INST] tokens. The model is designed to follow this specific instruction format, which is also supported via Hugging Face's apply_chat_template() method for seamless integration.
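As a minimal sketch, assuming the repository ships a tokenizer with a Mistral-style chat template, the prompt can be built either manually with the [INST] tokens or via apply_chat_template():

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "itstechuse/akeno-mergedv8"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Option 1: build the prompt manually with [INST] ... [/INST] tokens.
prompt = "[INST] Summarize the benefits of a 32k context window. [/INST]"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)

# Option 2 (equivalent): let the tokenizer's chat template wrap the user turn.
messages = [{"role": "user", "content": "Summarize the benefits of a 32k context window."}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

# Generate a response and strip the prompt tokens from the decoded output.
outputs = model.generate(input_ids, max_new_tokens=256, do_sample=True)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Both options produce the same token sequence; the chat template is the safer choice in application code because it keeps the formatting in sync with the tokenizer shipped in the repository.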
Limitations
As an instruct fine-tuned model, akeno-mergedv8 demonstrates the performance potential of its base model. It has no built-in moderation mechanisms, so users should add their own safeguards before deploying it in environments that require moderated outputs. For more detailed information, refer to the original paper and release blog post from Mistral AI.