nfaheem/Marcoroni-7b-DPO-Merge

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Jan 15, 2024 | License: apache-2.0 | Architecture: Transformer | Open Weights

nfaheem/Marcoroni-7b-DPO-Merge is a 7 billion parameter language model created by nfaheem. It merges fblgit/UNA-TheBeagle-7b-v1 and udkai/Turdus onto the base model madatnlp/marcoroni-7b-v3-safetensor using the TIES method. The merged model averages 74.9 across six benchmarks: ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, and GSM8K. It is designed for general text generation and scores well on reasoning and common sense benchmarks.


Marcoroni-7b-DPO-Merge: A High-Performing 7B Model

Marcoroni-7b-DPO-Merge is a 7 billion parameter language model developed by nfaheem and built with mergekit. The merge starts from the base model madatnlp/marcoroni-7b-v3-safetensor and combines it with fblgit/UNA-TheBeagle-7b-v1 and udkai/Turdus using the TIES merge method.
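
To give a feel for what a TIES merge does, here is a minimal sketch on toy weight lists. It is not mergekit's implementation and does not reproduce the settings used for this model. The function names and the `density` and `scale` parameters are assumptions made for illustration. The sketch follows the three commonly described TIES steps: trim each task vector to its largest-magnitude entries, elect a sign per parameter, and average only the values that agree with that sign.

```python
def trim(task_vector, density):
    """Keep the `density` fraction of entries with the largest magnitude; zero the rest."""
    k = max(1, round(len(task_vector) * density))
    order = sorted(range(len(task_vector)),
                   key=lambda i: abs(task_vector[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(task_vector)]


def ties_merge(base, finetuned_models, density=0.5, scale=1.0):
    """Toy TIES merge over flat lists of floats (hypothetical helper, not mergekit)."""
    # Task vector = fine-tuned weights minus base weights.
    task_vectors = [[f - b for f, b in zip(model, base)]
                    for model in finetuned_models]
    trimmed = [trim(tv, density) for tv in task_vectors]

    merged = []
    for i, b in enumerate(base):
        column = [tv[i] for tv in trimmed]
        # Elect the sign carrying the larger total magnitude.
        total = sum(column)
        sign = 1.0 if total > 0 else -1.0 if total < 0 else 0.0
        # Average only the entries whose sign matches the elected one.
        agreeing = [v for v in column if v * sign > 0]
        delta = sum(agreeing) / len(agreeing) if agreeing else 0.0
        merged.append(b + scale * delta)
    return merged


base = [0.0, 0.0, 0.0, 0.0]
model_a = [1.0, -2.0, 0.1, 3.0]
model_b = [-0.5, 1.0, 4.0, 0.2]
merged = ties_merge(base, [model_a, model_b], density=0.5)
print(merged)
```

In the second position the two toy models disagree in sign (-2.0 versus 1.0); the negative direction has the larger magnitude, so only -2.0 contributes to the merged value.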

Key Capabilities & Performance

The model averages 74.9 across six benchmarks. Individual scores:

  • ARC: 73.04
  • HellaSwag: 88.80
  • MMLU: 64.24
  • TruthfulQA: 70.47
  • Winogrande: 85.24
  • GSM8K: 67.63
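
The reported average can be checked directly from the six scores above. The snippet below is a small sketch of that arithmetic; the variable names are illustrative only.

```python
# Benchmark scores as listed on this page.
scores = {
    "ARC": 73.04,
    "HellaSwag": 88.80,
    "MMLU": 64.24,
    "TruthfulQA": 70.47,
    "Winogrande": 85.24,
    "GSM8K": 67.63,
}

# Unweighted mean over the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.1f}")
```

The unweighted mean rounds to 74.9, matching the stated figure.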

As of January 15, 2024, Marcoroni-7b-DPO-Merge ranked #1 on the HuggingFace Leaderboard among models around the 13B parameter size, ahead of models such as mlabonne/Beagle14-7b and udkai/Turdus in overall average score.

Good For

  • General text generation: Its balanced scores across the six benchmarks suggest suitability for a wide range of generative tasks.
  • Reasoning and common sense tasks: Its scores on ARC (73.04), MMLU (64.24), and Winogrande (85.24) indicate good capabilities in these areas.
  • Developers seeking a high-performing 7B model: Its #1 ranking on the HuggingFace Leaderboard as of January 15, 2024 makes it a competitive choice in its size class.