argilla/notus-7b-v1

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quantization: FP8 · Context Length: 8k · Published: Nov 16, 2023 · License: MIT · Architecture: Transformer · Open Weights

Argilla's Notus-7b-v1 is a 7 billion parameter GPT-like causal language model, fine-tuned using Direct Preference Optimization (DPO) on a curated version of the UltraFeedback dataset. This model, based on Zephyr-7b-sft-full, is optimized for chat applications and assistant-like interactions. It demonstrates competitive performance, surpassing Zephyr-7B-beta and Claude 2 on the AlpacaEval benchmark, making it suitable for high-quality conversational AI.


Notus 7B v1: DPO Fine-tuned Chat Model

Notus 7B v1, developed by Argilla, is a 7 billion parameter GPT-like model fine-tuned with Direct Preference Optimization (DPO). It builds upon zephyr-7b-sft-full, the base model for zephyr-7b-beta, but distinguishes itself through a meticulously curated preference dataset. Argilla identified and rectified data quality issues within the original UltraFeedback dataset, creating a binarized version based on preference ratings rather than critique scores.
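The idea behind the curated dataset is to turn multi-response ratings into binary chosen/rejected pairs keyed on preference ratings instead of critique (overall) scores. The sketch below illustrates that binarization step; it is not Argilla's actual pipeline, and the record layout and function name are assumptions for illustration.

```python
# Illustrative sketch (not Argilla's actual curation code): binarize a
# UltraFeedback-style record by taking the highest- and lowest-rated
# responses as the chosen/rejected pair, keyed on preference ratings
# rather than critique (overall) scores.

def binarize_preferences(record):
    """record: {'prompt': str, 'responses': [{'text': str, 'rating': float}, ...]}"""
    ranked = sorted(record["responses"], key=lambda r: r["rating"], reverse=True)
    return {
        "prompt": record["prompt"],
        "chosen": ranked[0]["text"],     # highest preference rating
        "rejected": ranked[-1]["text"],  # lowest preference rating
    }

example = {
    "prompt": "Explain DPO briefly.",
    "responses": [
        {"text": "DPO optimizes a policy directly from preference pairs.", "rating": 4.5},
        {"text": "It is a training method.", "rating": 2.0},
    ],
}
pair = binarize_preferences(example)
```

Each resulting `(prompt, chosen, rejected)` triple is the unit of data DPO trains on.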

Key Capabilities & Performance

  • Enhanced Chat Performance: Notus 7B v1 excels in chat-like applications, outperforming Zephyr-7B-beta and Claude 2 on the AlpacaEval benchmark with a 91.42% win rate, while maintaining comparable MT-Bench scores.
  • Improved Academic Benchmarks: It shows stronger performance on the Open LLM Leaderboard, achieving a higher average score (52.89) and better results in ARC, HellaSwag, MMLU, and Winogrande compared to Zephyr 7B dDPO.
  • Data-First Approach: The model's superior performance is attributed to Argilla's "data-first" strategy, focusing on high-quality, verified training data.

Training & Data Curation

Notus was trained using a new, curated version of the openbmb/UltraFeedback dataset, specifically argilla/ultrafeedback-binarized-preferences. This involved identifying and correcting mismatches between overall_score and actual response quality in the original dataset, leveraging Argilla's data annotation tools. The model primarily supports English and uses the same prompt template as HuggingFaceH4/zephyr-7b-beta.
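Since Notus shares zephyr-7b-beta's prompt template, prompts can be formatted the same way. The sketch below assumes the published zephyr-7b-beta layout (`<|system|>`, `<|user|>`, and `<|assistant|>` turns, each ended by `</s>`); in practice, prefer `tokenizer.apply_chat_template` from Hugging Face transformers, which uses the template stored with the model.

```python
# Minimal sketch of the zephyr-7b-beta-style chat template that Notus shares.
# Assumes the standard zephyr turn markers; the stored chat template on the
# Hub is the authoritative source.

def format_prompt(system: str, user: str) -> str:
    return (
        f"<|system|>\n{system}</s>\n"
        f"<|user|>\n{user}</s>\n"
        f"<|assistant|>\n"  # generation continues from here
    )

prompt = format_prompt(
    "You are a helpful assistant.",
    "What is Direct Preference Optimization?",
)
```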

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model (specific values are shown in the interactive view). Parameters covered: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
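For reference, a request to an OpenAI-compatible completions endpoint can carry these sampler parameters in its payload. The values below are illustrative placeholders, not the actual top user configurations, and fields like `repetition_penalty` and `min_p` are extensions that only some inference servers accept.

```python
# Example payload for an OpenAI-compatible completions API using the sampler
# parameters listed above. Values are illustrative placeholders only;
# repetition_penalty and min_p are non-standard extensions supported by some
# inference backends.
payload = {
    "model": "argilla/notus-7b-v1",
    "prompt": "<|user|>\nHello!</s>\n<|assistant|>\n",
    "max_tokens": 256,
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.1,
    "min_p": 0.05,
}
```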