Xenon-2: Instruction-Tuned Mistral-7B
Xenon-2 is a 7-billion-parameter instruction-tuned language model developed by Xenon1 and built on the Mistral-7B-v0.1 base model. From that base it inherits Grouped-Query Attention, which speeds up inference, and Sliding-Window Attention, which lowers the cost of handling long sequences. The model was fine-tuned on the UltraFeedback dataset using techniques from the "Self-Rewarding Language Models" paper, with the goal of improving instruction following.
Key Capabilities & Features
- Instruction Following: Optimized for responding to user instructions, using a specific [INST] and [/INST] token format for prompts.
- Mistral-7B Foundation: Benefits from the robust and efficient architecture of Mistral-7B-v0.1.
- Self-Rewarding Fine-tuning: Fine-tuned with techniques from the "Self-Rewarding Language Models" paper to generate more helpful and relevant responses.
- 8192-token Context: Supports processing longer inputs and maintaining conversational coherence over extended interactions.
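
As a rough illustration of the [INST] and [/INST] prompt format mentioned above, the following Python sketch wraps an instruction, and optionally earlier turns, in those markers. The helper name `build_prompt`, the single-space separators, and the multi-turn layout are assumptions made for illustration and are not confirmed by this model card. Special tokens such as a beginning-of-sequence marker are left to the tokenizer.

```python
from typing import List, Optional, Tuple


def build_prompt(
    instruction: str,
    history: Optional[List[Tuple[str, str]]] = None,
) -> str:
    """Wrap an instruction in [INST] ... [/INST] markers.

    `history` is an optional list of (user_message, model_reply) pairs from
    earlier turns. The exact spacing used here is an assumption, not a
    documented requirement of the model.
    """
    parts = []
    for user_msg, model_reply in history or []:
        # Each earlier turn: the wrapped user message followed by the reply.
        parts.append(f"[INST] {user_msg.strip()} [/INST] {model_reply.strip()}")
    # The new instruction ends with [/INST] so the model continues from there.
    parts.append(f"[INST] {instruction.strip()} [/INST]")
    return " ".join(parts)


if __name__ == "__main__":
    print(build_prompt("Summarize grouped-query attention in one sentence."))
```

The resulting string can then be passed to whichever tokenizer and generation API is used to run the model. Keeping the formatting in one helper makes it easy to adjust if the expected template differs from this sketch.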
When to Use Xenon-2
- Conversational AI: Ideal for chatbots and interactive agents that require precise instruction adherence.
- General Instruction Tasks: Suitable for a wide range of tasks where clear, concise, and accurate responses to prompts are needed.
- Research & Development: Provides a strong base for further experimentation with self-rewarding fine-tuning techniques.