Xenon1/Xenon-2
Xenon1/Xenon-2 is a 7 billion parameter instruction-tuned causal language model based on the Mistral-7B-v0.1 architecture, featuring Grouped-Query Attention and Sliding-Window Attention. It was fine-tuned on the Ultrafeedback dataset using self-rewarding language model techniques. This model is designed for instruction-following tasks, leveraging its 8192-token context length for conversational applications.
Xenon-2: Instruction-Tuned Mistral-7B
Xenon-2 is a 7 billion parameter instruction-tuned language model developed by Xenon1, built on the Mistral-7B-v0.1 base architecture. It inherits two architectural features from that base: Grouped-Query Attention, in which several query heads share one key/value head to shrink the key/value cache, and Sliding-Window Attention, in which each token attends only to a fixed-size window of recent tokens to bound attention cost on long sequences. The model was fine-tuned on the Ultrafeedback dataset using techniques from the "Self-Rewarding Language Models" paper, with the goal of improving instruction following.
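To make the two attention features concrete, here is a minimal pure-Python sketch. It is an illustration only, not the model's actual implementation: the function names are hypothetical, the window here counts the current token (a convention choice), and the toy sizes in the usage lines are far smaller than a real model's.

```python
from typing import List


def sliding_window_causal_mask(seq_len: int, window: int) -> List[List[bool]]:
    """Return mask[i][j] == True when query position i may attend to key position j.

    A position may look only at itself and earlier positions (causal), and only
    at the most recent `window` positions, itself included (sliding window).
    """
    if seq_len < 0 or window < 1:
        raise ValueError("seq_len must be >= 0 and window must be >= 1")
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def kv_head_for_query_head(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Map a query head to the key/value head it shares under grouped-query attention."""
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    if not 0 <= q_head < n_q_heads:
        raise ValueError("q_head out of range")
    group_size = n_q_heads // n_kv_heads  # query heads per shared key/value head
    return q_head // group_size


# Toy usage: 5 positions, window of 3; 8 query heads sharing 2 key/value heads.
mask = sliding_window_causal_mask(seq_len=5, window=3)
last_row = mask[4]  # the final position sees only positions 2, 3 and 4
shared = [kv_head_for_query_head(h, n_q_heads=8, n_kv_heads=2) for h in range(8)]
```

With grouping, the key/value cache holds one entry per key/value head rather than one per query head, which is where the memory saving comes from.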
Key Capabilities & Features
- Instruction Following: Optimized for responding to user instructions, leveraging a specific [INST] and [/INST] token format for prompts.
- Mistral-7B Foundation: Benefits from the robust and efficient architecture of Mistral-7B-v0.1.
- Self-Rewarding Fine-tuning: Fine-tuned on Ultrafeedback with techniques from the "Self-Rewarding Language Models" paper, with the aim of producing more helpful and relevant responses.
- 8192-token Context: Supports processing longer inputs and maintaining conversational coherence over extended interactions.
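The prompt format and context limit above can be sketched as follows. This is a hedged example, not the model's official template: the card names only the [INST] and [/INST] markers, so the exact spacing, the use of </s> between turns, and the helper names build_prompt and fits_context are assumptions borrowed from the common Mistral-instruct convention. In practice, prefer the tokenizer's own chat template if the model repository ships one.

```python
from typing import Callable, List, Optional, Tuple

CONTEXT_LENGTH = 8192  # context length stated on this card


def build_prompt(instruction: str,
                 history: Optional[List[Tuple[str, str]]] = None) -> str:
    """Wrap an instruction, plus optional earlier (user, assistant) turns, in [INST] markers.

    Assumption: earlier assistant replies are closed with "</s>", as in the
    Mistral-instruct convention. The leading BOS token is left to the tokenizer.
    """
    parts = []
    for user_msg, assistant_msg in history or []:
        parts.append(f"[INST] {user_msg.strip()} [/INST] {assistant_msg.strip()}</s>")
    parts.append(f"[INST] {instruction.strip()} [/INST]")
    return "".join(parts)


def fits_context(prompt: str,
                 count_tokens: Callable[[str], int],
                 max_new_tokens: int,
                 context_len: int = CONTEXT_LENGTH) -> bool:
    """Check that the prompt plus the planned generation stays within the context window.

    `count_tokens` is supplied by the caller, for example a function that
    returns the length of the model tokenizer's encoding of the string.
    """
    return count_tokens(prompt) + max_new_tokens <= context_len


# Usage with a crude whitespace token counter standing in for a real tokenizer.
prompt = build_prompt(
    "Summarize the plot in one sentence.",
    history=[("Name a classic novel.", "Moby-Dick by Herman Melville.")],
)
ok = fits_context(prompt, count_tokens=lambda s: len(s.split()), max_new_tokens=256)
```

If fits_context returns False, a simple remedy is to drop the oldest turns from history and rebuild the prompt until it fits.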
When to Use Xenon-2
- Conversational AI: Ideal for chatbots and interactive agents that require precise instruction adherence.
- General Instruction Tasks: Suitable for a wide range of tasks where clear, concise, and accurate responses to prompts are needed.
- Research & Development: Provides a strong base for further experimentation with self-rewarding fine-tuning techniques.