Xenon1/Xenon-4

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 8k | Published: Feb 4, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

Xenon1/Xenon-4 is a Mistral-7B-v0.1 based language model fine-tuned on the Ultrafeedback dataset using techniques from the "Self-Rewarding Language Models" paper. From its Mistral base it inherits Grouped-Query Attention, Sliding-Window Attention, and a byte-fallback BPE tokenizer. It is tuned for instruction-following tasks such as conversational AI and text generation.

Overview

Xenon1/Xenon-4 is an instruction-tuned language model built upon the Mistral-7B-v0.1 architecture. It has been fine-tuned using the Ultrafeedback dataset and incorporates methodologies described in the Self-Rewarding Language Models paper. This approach aims to enhance the model's ability to follow instructions effectively and generate high-quality responses.

Key Architectural Features

This model inherits several advanced architectural choices from its Mistral-7B-v0.1 base, contributing to its efficiency and performance:

  • Grouped-Query Attention: Several query heads share each key/value head, which speeds up inference and reduces the memory needed for the key/value cache.
  • Sliding-Window Attention: Handles longer contexts more efficiently by restricting each token's attention to a local window of recent tokens.
  • Byte-fallback BPE tokenizer: Tokenizes diverse text inputs robustly; characters missing from the vocabulary are encoded as raw bytes instead of being mapped to an unknown token.
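
As a rough illustration of the first two features, the sketch below builds a sliding-window causal mask and compares key/value cache sizes with and without grouped-query attention. The function names are hypothetical. The layer and head counts (32 layers, 32 query heads, 8 key/value heads, head dimension 128) are assumed from the published Mistral-7B-v0.1 configuration rather than stated on this page, and the window convention (a token sees itself plus the previous window - 1 tokens) is one common choice that may differ by one from a given implementation.

```python
def sliding_window_mask(seq_len, window):
    """Return mask[i][j] == True when query position i may attend to key j.

    Assumed convention: causal (j <= i) and limited to the last `window`
    positions including the current one (i - j < window).
    """
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def kv_cache_bytes_per_token(n_layers, n_kv_heads, head_dim, bytes_per_value=2):
    """Bytes of key/value cache stored per token (factor 2 = keys + values)."""
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_value


# Small mask so the banded structure is easy to see.
mask = sliding_window_mask(seq_len=6, window=3)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))

# Assumed Mistral-7B-v0.1 shape: 8 shared KV heads versus a hypothetical
# variant where each of the 32 query heads has its own KV head.
gqa = kv_cache_bytes_per_token(n_layers=32, n_kv_heads=8, head_dim=128)
mha = kv_cache_bytes_per_token(n_layers=32, n_kv_heads=32, head_dim=128)
print(f"GQA: {gqa} bytes/token, full multi-head: {mha} bytes/token")
```

Under these assumptions the grouped-query layout stores a quarter of the key/value data per token at 16-bit precision; the actual footprint depends on the serving precision and cache implementation.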

Instruction Format

Xenon-4 is designed to be used with a specific instruction format, leveraging [INST] and [/INST] tokens to delineate user prompts. This format is compatible with Hugging Face's apply_chat_template() method, simplifying integration into conversational applications. The model expects the first instruction to begin with a begin-of-sentence token and subsequent instructions to follow without it, with assistant generations ending with an end-of-sentence token.
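
The layout described above can be sketched as a plain string builder. This is a minimal illustration, not the model's official template: the helper name build_prompt is hypothetical, the literal strings "<s>" and "</s>" are assumed to be the begin-of-sentence and end-of-sentence tokens (as in the Mistral tokenizer), and the exact whitespace around [INST] and [/INST] is an assumption. In practice, apply_chat_template() on the model's tokenizer is the safer route.

```python
BOS = "<s>"   # assumed begin-of-sentence token string
EOS = "</s>"  # assumed end-of-sentence token string


def build_prompt(turns):
    """Format a conversation for an [INST]-style model.

    `turns` is a list of (user_text, assistant_text) pairs; use None as the
    assistant text of the final pair to leave the prompt open for generation.
    """
    parts = [BOS]  # BOS appears once, before the first instruction only
    for user_text, assistant_text in turns:
        parts.append(f"[INST] {user_text} [/INST]")
        if assistant_text is not None:
            # Each completed assistant reply is closed with EOS.
            parts.append(f" {assistant_text}{EOS}")
    return "".join(parts)


prompt = build_prompt([
    ("What is your favourite condiment?", "Mayonnaise."),
    ("Do you have a recipe for it?", None),
])
print(prompt)
```

If the tokenizer is configured to prepend the begin-of-sentence token itself, drop BOS from the string before tokenizing so it is not duplicated.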

Use Cases

This model is well-suited for applications requiring precise instruction following, such as:

  • Chatbots and conversational agents
  • Automated content generation based on specific prompts
  • Interactive AI systems

Popular Sampler Settings

The top 3 parameter combinations used by Featherless users for this model cover the following sampler parameters: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.