papahawk/devi-7b

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4k · Published: Jan 7, 2024 · License: MIT · Architecture: Transformer · Open Weights

papahawk/devi-7b is a 7-billion-parameter GPT-style language model forked from Zephyr-7B-β, itself a fine-tuned version of Mistral-7B-v0.1. Developed by papahawk, it is optimized as a helpful assistant through Direct Preference Optimization (DPO) on synthetic datasets. The model excels in chat applications and, at release, ranked highly among 7B models on the MT-Bench and AlpacaEval benchmarks.


Devi 7B: A DPO-Optimized 7B Assistant Model

Devi 7B is a 7 billion parameter language model, a fork of Zephyr-7B-β, which itself is a fine-tuned version of mistralai/Mistral-7B-v0.1. It was developed by papahawk with significant contributions from HuggingFaceH4's work on Zephyr. The model is primarily English-language and is licensed under MIT.

Key Capabilities & Training:

  • Assistant-Oriented: Trained to function as a helpful assistant.
  • Direct Preference Optimization (DPO): Fine-tuned using DPO on a mix of publicly available, synthetic datasets, including a filtered version of UltraChat and openbmb/UltraFeedback.
  • Performance Focus: The training process specifically removed in-built alignment from some datasets to boost performance on benchmarks like MT-Bench.
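The model card does not ship training code, but the DPO objective mentioned above is simple enough to sketch. The function below is illustrative only (the names and toy log-probabilities are not from the Devi training pipeline): DPO scores a preference pair by how much more the policy prefers the chosen response over the rejected one, relative to a frozen reference model.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Each argument is the summed log-probability of a full response
    (chosen or rejected) under the trainable policy or the frozen
    reference model. beta controls how far the policy may drift
    from the reference.
    """
    # Implicit reward margins: how much more the policy favors each
    # response than the reference model does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    # Negative log-sigmoid of the scaled margin difference.
    logits = beta * (chosen_margin - rejected_margin)
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# Toy numbers (hypothetical): the policy already prefers the chosen
# response more than the reference does, so the loss drops below
# the indifference value of log(2).
loss = dpo_loss(-10.0, -14.0, -12.0, -13.0, beta=0.1)
```

In practice this loss is minimized over a dataset of (prompt, chosen, rejected) triples such as UltraFeedback; libraries like TRL wrap the bookkeeping, but the core objective is the one-liner above.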

Performance Highlights:

  • Top-tier 7B Chat Model: At its release, Zephyr-7B-β (the base for Devi 7B) was the highest-ranked 7B chat model on the MT-Bench and AlpacaEval benchmarks.
  • MT-Bench Score: Achieved a score of 7.34 on MT-Bench, outperforming larger models like Llama2-Chat-70B in several categories.
  • AlpacaEval Win Rate: Demonstrated a 90.60% win rate on AlpacaEval.

Intended Uses & Limitations:

  • Chat Applications: Ideal for chat and conversational AI due to its fine-tuning on diverse synthetic dialogues.
  • Potential for Problematic Outputs: Due to the removal of some safety alignments during training, the model may generate problematic text if explicitly prompted to do so.
  • Complex Tasks: Lags behind proprietary models in complex tasks such as coding and mathematics.
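Because Devi 7B is a fork of Zephyr-7B-β, it is reasonable to assume it inherits Zephyr's chat prompt format. The helper below is a minimal sketch of that format under this assumption, not an official API of this repository:

```python
def build_zephyr_prompt(system, messages):
    """Format a conversation in the Zephyr chat style that Devi 7B,
    as a Zephyr-7B-beta fork, is assumed to inherit (assumption,
    not confirmed by the model card).

    messages is a list of (role, content) pairs with role in
    {"user", "assistant"}; the prompt ends with an open assistant
    turn for the model to continue.
    """
    parts = [f"<|system|>\n{system}</s>"]
    for role, content in messages:
        parts.append(f"<|{role}|>\n{content}</s>")
    parts.append("<|assistant|>\n")  # generation starts here
    return "\n".join(parts)

prompt = build_zephyr_prompt(
    "You are a helpful assistant.",
    [("user", "Explain DPO in one sentence.")],
)
```

If the repository ships a tokenizer chat template, prefer `tokenizer.apply_chat_template(...)` from Hugging Face Transformers over hand-rolling the string.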