Xenon1/Xenon-2

Task: Text Generation
Concurrency Cost: 1
Model Size: 7B
Quant: FP8
Ctx Length: 8k
Published: Feb 4, 2024
License: apache-2.0
Architecture: Transformer
Open Weights
Cold

Xenon1/Xenon-2 is a 7 billion parameter instruction-tuned causal language model based on the Mistral-7B-v0.1 architecture, featuring Grouped-Query Attention and Sliding-Window Attention. It was fine-tuned on the Ultrafeedback dataset using self-rewarding language model techniques. This model is designed for instruction-following tasks, leveraging its 8192-token context length for conversational applications.

Xenon-2: Instruction-Tuned Mistral-7B

Xenon-2 is a 7 billion parameter instruction-tuned language model developed by Xenon1, built on the Mistral-7B-v0.1 base architecture. It inherits two attention features from that architecture: Grouped-Query Attention, in which several query heads share one key/value head to shrink the key/value cache, and Sliding-Window Attention, in which each token attends only to a fixed-size window of preceding tokens. The model was fine-tuned on the Ultrafeedback dataset using techniques from the "Self-Rewarding Language Models" paper, with the aim of improving instruction following.
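The two attention features named above can be illustrated with a small, framework-free sketch. It is not taken from the Xenon-2 or Mistral code: the function names, the toy sizes (5 tokens, window of 3, 8 query heads, 2 key/value heads), and the convention that the window includes the current token are all assumptions chosen for illustration.

```python
from __future__ import annotations


def sliding_window_causal_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Return mask[i][j] == True when query position i may attend to key position j.

    Assumed convention: position i sees itself and the (window - 1) positions
    before it, and never sees future positions.
    """
    if seq_len < 0 or window < 1:
        raise ValueError("seq_len must be >= 0 and window must be >= 1")
    return [
        [j <= i and i - j < window for j in range(seq_len)]
        for i in range(seq_len)
    ]


def kv_head_for_query(q_head: int, n_q_heads: int, n_kv_heads: int) -> int:
    """Map a query head to the key/value head it shares under grouped-query attention."""
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    if not 0 <= q_head < n_q_heads:
        raise ValueError("q_head out of range")
    group_size = n_q_heads // n_kv_heads  # query heads per key/value head
    return q_head // group_size


# Toy sizes, hypothetical and far smaller than a real model.
mask = sliding_window_causal_mask(seq_len=5, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
# x....
# xx...
# xxx..
# .xxx.
# ..xxx

print([kv_head_for_query(h, n_q_heads=8, n_kv_heads=2) for h in range(8)])
# [0, 0, 0, 0, 1, 1, 1, 1]
```

The mask shows why per-token attention cost stops growing once the sequence is longer than the window, and the head mapping shows why the key/value cache is smaller than it would be with one key/value head per query head.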

Key Capabilities & Features

  • Instruction Following: Tuned to respond to user instructions. Prompts wrap each instruction between [INST] and [/INST] tokens.
  • Mistral-7B Foundation: Inherits the Mistral-7B-v0.1 architecture, including Grouped-Query Attention and Sliding-Window Attention.
  • Self-Rewarding Fine-tuning: Fine-tuned on Ultrafeedback with techniques from the "Self-Rewarding Language Models" paper to produce more helpful and relevant responses.
  • 8192-token Context: Supports processing longer inputs and maintaining conversational coherence over extended interactions.
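The [INST] and [/INST] prompt format mentioned in the list above can be sketched as a small string builder. The model card only names the two tokens; the exact spacing, the `<s>` and `</s>` sequence markers, and the multi-turn layout below are assumptions borrowed from the usual Mistral instruct convention, and `build_prompt` is a hypothetical helper, not part of any library.

```python
from __future__ import annotations

BOS = "<s>"    # assumed beginning-of-sequence marker
EOS = "</s>"   # assumed end-of-sequence marker closing each model reply


def build_prompt(
    history: list[tuple[str, str]],
    instruction: str,
    add_bos: bool = True,
) -> str:
    """Format earlier (user, assistant) turns plus a new instruction.

    Set add_bos=False if the tokenizer you use already prepends a BOS token,
    otherwise the sequence would start with two of them.
    """
    parts = [BOS] if add_bos else []
    for user_text, assistant_text in history:
        parts.append(f"[INST] {user_text.strip()} [/INST] {assistant_text.strip()}{EOS}")
    # The final instruction is left open so the model generates the reply.
    parts.append(f"[INST] {instruction.strip()} [/INST]")
    return "".join(parts)


print(build_prompt([], "Summarize grouped-query attention in one sentence."))
print(build_prompt([("Hi", "Hello!")], "What can you do?"))
```

If the tokenizer shipped with the model provides its own chat template, that template is the safer source of truth for spacing and special tokens.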

When to Use Xenon-2

  • Conversational AI: Suited to chatbots and interactive agents that need to follow instructions closely.
  • General Instruction Tasks: Suitable for a wide range of tasks that call for clear, concise, and accurate responses to prompts.
  • Research & Development: Offers a starting point for further experiments with self-rewarding fine-tuning techniques.
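For conversational use, the 8192-token context has to hold both the prompt and the generated reply, so long chats need trimming. The following is a minimal sketch of one simple policy (drop the oldest turns first). `trim_history` and `approx_token_count` are hypothetical names, and the whitespace-based counter is only a stand-in: real token counts come from the model's tokenizer and will differ.

```python
from __future__ import annotations

from typing import Callable

CONTEXT_LENGTH = 8192  # context length stated on the model card


def approx_token_count(text: str) -> int:
    """Crude stand-in for a tokenizer: counts whitespace-separated words."""
    return len(text.split())


def trim_history(
    turns: list[str],
    max_new_tokens: int,
    count_tokens: Callable[[str], int] = approx_token_count,
    context_length: int = CONTEXT_LENGTH,
) -> list[str]:
    """Keep the most recent turns that fit, reserving room for the reply."""
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    kept: list[str] = []
    used = 0
    # Walk from newest to oldest and stop at the first turn that does not fit.
    for turn in reversed(turns):
        cost = count_tokens(turn)
        if used + cost > budget:
            break
        kept.append(turn)
        used += cost
    kept.reverse()  # restore chronological order
    return kept


# Toy budget of 3 "tokens" to make the behaviour visible.
print(trim_history(["a b c", "d e", "f"], max_new_tokens=0, context_length=3))
# ['d e', 'f']
```

Passing the real tokenizer's length function as `count_tokens` keeps the same logic while making the budget accurate.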