argilla/CapybaraHermes-2.5-Mistral-7B

Text Generation · Model Size: 7B · Quant: FP8 · Context Length: 4k · Concurrency Cost: 1 · Published: Jan 30, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

CapybaraHermes-2.5-Mistral-7B is a 7 billion parameter language model developed by Argilla, preference-tuned from OpenHermes-2.5-Mistral-7B. It is optimized for multi-turn conversational performance, demonstrating improved MTBench Second Turn scores compared to its base model and Mistral-7B-Instruct-v0.2. This model is suitable for chat applications requiring consistent performance across extended dialogues.


Model Overview

CapybaraHermes-2.5-Mistral-7B is a 7 billion parameter chat model developed by Argilla, preference-tuned from OpenHermes-2.5-Mistral-7B. It was trained using LoRA and TRL for 3 epochs on Argilla's dpo-mix-7k dataset, and serves as the launch partner for the Capybara-DPO dataset.
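
DPO preference tuning consumes triples of a prompt, a chosen (preferred) response, and a rejected response. A minimal sketch of that record schema follows, using the `prompt`/`chosen`/`rejected` field names that TRL's DPO trainer conventionally expects; the example texts are invented for illustration and are not actual dpo-mix-7k rows.

```python
# Sketch of a single DPO preference record. The field names follow TRL's
# convention (prompt/chosen/rejected); the example texts are hypothetical,
# not taken from the dpo-mix-7k dataset.
def make_preference_record(prompt, chosen, rejected):
    """Bundle one preference comparison into the dict schema DPO training uses."""
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

record = make_preference_record(
    prompt="Explain what a context window is.",
    chosen="A context window is the maximum number of tokens a model can attend to at once.",
    rejected="It is a window on your screen.",
)
```

The trainer then optimizes the model to prefer the `chosen` completion over the `rejected` one for the same prompt.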

Key Capabilities & Performance

This model's primary differentiation is its enhanced multi-turn conversational performance. Benchmarked against OpenHermes-2.5-Mistral-7B and Mistral-7B-Instruct-v0.2, CapybaraHermes-2.5-Mistral-7B improves the MTBench second-turn score, reaching 7.5625 versus 7.2875 and 7.1 respectively. It also performs well across a range of benchmarks:

  • AGIEval: 43.8
  • GPT4All: 73.35
  • Bigbench: 42.44
  • MTBench Average: 7.903125

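Assuming the MTBench average is simply the mean of the first- and second-turn scores (a hypothesis, since the first-turn number is not listed here), the reported average of 7.903125 and second-turn score of 7.5625 imply a first-turn score of about 8.24:

```python
# MTBench reports a first-turn and a second-turn score. Assuming the listed
# average is their mean, we can back out the unlisted first-turn score.
mtbench_average = 7.903125
second_turn = 7.5625

first_turn = 2 * mtbench_average - second_turn
print(first_turn)  # 8.24375
```
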
Use Cases

This model is particularly well-suited for applications requiring robust and consistent performance in multi-turn dialogues, such as chatbots, conversational AI agents, and interactive assistants where maintaining context and coherence over several exchanges is crucial.
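
For multi-turn use, the model presumably follows the ChatML prompt format inherited from OpenHermes-2.5 (an assumption here; confirm against the model card). A minimal sketch of rendering a conversation into that format:

```python
# Sketch of ChatML formatting for a multi-turn conversation, assuming
# CapybaraHermes inherits OpenHermes-2.5's ChatML template. In practice,
# transformers' tokenizer.apply_chat_template handles this automatically.
def to_chatml(messages):
    """Render a list of {'role', 'content'} dicts as a ChatML prompt string."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant\n")  # cue the model to respond
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize DPO in one sentence."},
])
```

Each prior user/assistant exchange is appended to the message list, so the model sees the full dialogue history on every turn.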