HuggingFaceH4/zephyr-7b-alpha

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 8K · Published: Oct 9, 2023 · License: MIT · Architecture: Transformer · Open Weights

Zephyr-7B-alpha is a 7 billion parameter language model developed by HuggingFaceH4, fine-tuned from Mistral-7B-v0.1. It is optimized to act as a helpful assistant, trained using Direct Preference Optimization (DPO) on a mix of publicly available, synthetic datasets. The model excels in chat-based applications: its training intentionally omitted some of the in-built alignment present in those datasets, which improves helpfulness at the cost of weaker safety guardrails.


Zephyr-7B-alpha: A Fine-Tuned Assistant Model

Zephyr-7B-alpha is the inaugural model in the Zephyr series, developed by HuggingFaceH4. It is a 7 billion parameter language model, building upon the robust mistralai/Mistral-7B-v0.1 base model. This model is specifically fine-tuned to function as a helpful assistant.

Key Capabilities & Training

  • Fine-tuning Method: Zephyr-7B-alpha was trained using Direct Preference Optimization (DPO).
  • Dataset Mix: Training involved a combination of publicly available, synthetic datasets, including an initial fine-tuning on a variant of the UltraChat dataset.
  • Alignment: Further alignment was performed with 🤗 TRL's DPOTrainer on the openbmb/UltraFeedback dataset, which contains 64k prompts and GPT-4 ranked model completions.
  • Performance Focus: The model's training intentionally removed some in-built alignment from datasets to boost performance on MT Bench and enhance helpfulness.
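To make the DPO training objective concrete, the sketch below computes the per-pair DPO loss from the summed log-probabilities of a chosen and a rejected completion under the policy and the frozen reference model. This is a minimal illustration of the published DPO formula (with the hypothetical default `beta=0.1`), not the actual TRL `DPOTrainer` implementation used to train Zephyr.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for a single preference pair.

    Each argument is the summed log-probability of the chosen or
    rejected completion under the policy or the reference model.
    """
    # Log-ratios: how much more the policy favors each completion
    # than the frozen reference model does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # -log sigmoid(beta * margin): minimized when the policy widens
    # the gap between chosen and rejected relative to the reference.
    margin = beta * (chosen_logratio - rejected_logratio)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy and reference agree exactly, the loss sits at `log 2`; it falls below that as the policy learns to prefer the chosen completion more strongly than the reference does.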

Intended Use & Limitations

Zephyr-7B-alpha is primarily intended for chat applications, offering strong performance as a conversational assistant. However, due to the deliberate removal of certain alignment techniques (like RLHF or in-the-loop filtering), the model is more prone to generating problematic outputs if explicitly prompted to do so. Users should be aware of this potential for unaligned responses.
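For chat use, Zephyr's tokenizer ships a chat template that wraps each turn in `<|system|>`, `<|user|>`, and `<|assistant|>` headers terminated by the `</s>` end-of-sequence token. The sketch below builds that prompt layout by hand to show the format; in practice you would call `tokenizer.apply_chat_template` from 🤗 Transformers rather than formatting strings yourself.

```python
def build_zephyr_prompt(messages):
    """Format a list of {'role', 'content'} dicts into Zephyr's
    chat layout.  Each turn is opened with a <|role|> header and
    closed with </s>; the prompt ends with an open <|assistant|>
    header so generation continues as the assistant."""
    parts = []
    for msg in messages:
        parts.append(f"<|{msg['role']}|>\n{msg['content']}</s>\n")
    parts.append("<|assistant|>\n")
    return "".join(parts)

prompt = build_zephyr_prompt([
    {"role": "system", "content": "You are a friendly chatbot."},
    {"role": "user", "content": "Hello!"},
])
```

Feeding a prompt in this shape to the model keeps generations in-distribution with how the assistant turns were formatted during fine-tuning.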

Popular Sampler Settings

The three most popular parameter combinations used by Featherless users for this model vary across the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p