Xenon1/Xenon-1

TEXT GENERATION
Concurrency Cost: 1
Model Size: 7B
Quant: FP8
Ctx Length: 8k
Published: Feb 4, 2024
License: apache-2.0
Architecture: Transformer
Open Weights
Cold

Xenon1/Xenon-1 is a 7-billion-parameter causal language model based on the Mistral-7B-v0.1 architecture and fine-tuned on the Ultrafeedback dataset. It applies techniques from the Self-Rewarding Language Models paper to improve instruction following. With an 8192-token context length, the model is intended for conversational AI and instruction-based tasks.


Xenon-1 Model Overview

Xenon-1 is a 7-billion-parameter instruction-tuned language model built on the Mistral-7B-v0.1 architecture. It is fine-tuned on the Ultrafeedback dataset using techniques described in the "Self-Rewarding Language Models" paper. The goal of this training method is to improve how well the model follows instructions and how relevant its responses are.

Key Capabilities

  • Instruction Following: Enhanced through fine-tuning on the Ultrafeedback dataset, making it suitable for various instruction-based tasks.
  • Mistral-7B-v0.1 Foundation: Benefits from architectural choices like Grouped-Query Attention, Sliding-Window Attention, and a Byte-fallback BPE tokenizer, contributing to efficient processing.
  • Conversational AI: Designed to handle multi-turn conversations using a specific [INST] and [/INST] instruction format, which is supported via Hugging Face's apply_chat_template() method.
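The [INST] / [/INST] conversation format mentioned above can be illustrated with a small, dependency-free sketch. It assumes the layout used by Mistral-style instruct templates (a BOS token, user turns wrapped in [INST] ... [/INST], assistant turns closed with an EOS token). The function name build_inst_prompt and the default "<s>" / "</s>" token strings are hypothetical choices for illustration; the exact spacing and special tokens Xenon-1 expects are defined by the chat template shipped with its tokenizer, which may differ.

```python
def build_inst_prompt(messages, bos="<s>", eos="</s>"):
    """Format alternating user/assistant messages in an [INST]-style layout.

    Assumption: this mirrors a Mistral-style instruct template. The model's
    own tokenizer chat template is the authoritative definition.
    """
    parts = [bos]
    for index, message in enumerate(messages):
        # Turns must alternate, starting with the user.
        expected_role = "user" if index % 2 == 0 else "assistant"
        if message["role"] != expected_role:
            raise ValueError(
                f"message {index} has role {message['role']!r}, "
                f"expected {expected_role!r}"
            )
        if message["role"] == "user":
            parts.append(f"[INST] {message['content']} [/INST]")
        else:
            # Assistant turns are closed with the end-of-sequence token.
            parts.append(f"{message['content']}{eos} ")
    return "".join(parts)


if __name__ == "__main__":
    conversation = [
        {"role": "user", "content": "What is a causal language model?"},
        {"role": "assistant", "content": "A model that predicts the next token."},
        {"role": "user", "content": "Give one example."},
    ]
    print(build_inst_prompt(conversation))
```

In practice, when the model is loaded through Hugging Face Transformers, passing the same list of role/content dictionaries to the tokenizer's apply_chat_template() method is preferable to hand-built strings, since it uses the template stored with the model.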

Good For

  • Chatbots and Virtual Assistants: Its instruction-tuned nature and conversational format make it well-suited for interactive applications.
  • General Instruction-Based Tasks: Can be used for a wide range of tasks where clear instructions are provided, such as question answering, content generation, and summarization.
  • Research and Development: Provides a strong base for further experimentation and fine-tuning on specific datasets, leveraging its Mistral-7B foundation and self-rewarding training approach.

Popular Sampler Settings

Featherless tracks the top 3 parameter combinations its users apply to this model. Each combination covers these sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p