jsfs11/SnorkelWestBeagle-DARETIES-7B

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4K · Published: Jan 25, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

jsfs11/SnorkelWestBeagle-DARETIES-7B is a 7-billion-parameter language model merged from Snorkel-Mistral-PairRM-DPO, WestLake-7B-v2, and NeuralBeagle14-7B using the DARE TIES method. Built on the Mistral-7B-v0.1 architecture, it supports a 4096-token context length. The model demonstrates strong general reasoning, with an average score of 73.03 on the Open LLM Leaderboard, making it well suited to a range of general-purpose language tasks.


Model Overview

jsfs11/SnorkelWestBeagle-DARETIES-7B is a 7 billion parameter language model created by merging three distinct models: snorkelai/Snorkel-Mistral-PairRM-DPO, senseable/WestLake-7B-v2, and mlabonne/NeuralBeagle14-7B. This merge was performed using the DARE TIES method, with mistralai/Mistral-7B-v0.1 serving as the base architecture.
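The exact merge configuration is not reproduced here, but the core DARE TIES procedure can be sketched in plain PyTorch: each fine-tuned model's parameter delta from the base is randomly sparsified and rescaled (DARE), then sign conflicts across models are resolved before the deltas are summed (TIES). The function below is an illustrative sketch under those assumptions; the `density` and `weights` values are placeholders, not the settings used for this model.

```python
# Illustrative sketch of DARE TIES merging in plain PyTorch -- not the
# actual tooling or settings used for this model.
import torch

def dare_ties_merge(base, finetuned, weights, density=0.5):
    """Merge fine-tuned state dicts into a base state dict.

    base      -- dict[name, Tensor]: base model parameters
    finetuned -- list of state dicts, one per source model
    weights   -- list of per-model merge weights (placeholder values)
    density   -- fraction of each delta kept (DARE drop rate = 1 - density)
    """
    merged = {}
    for name, base_param in base.items():
        deltas = []
        for sd, w in zip(finetuned, weights):
            delta = sd[name] - base_param             # task vector vs. base
            keep = torch.rand_like(delta) < density   # DARE: random drop
            delta = delta * keep / density            # rescale survivors
            deltas.append(w * delta)
        stacked = torch.stack(deltas)
        # TIES: elect a per-element sign from the weighted sum, then keep
        # only the deltas that agree with it before summing.
        sign = torch.sign(stacked.sum(dim=0))
        agree = torch.sign(stacked) == sign
        merged[name] = base_param + (stacked * agree).sum(dim=0)
    return merged
```

Merge toolkits typically expose the per-model densities and weights as configuration, but the drop-rescale-and-sign-consensus idea above is the essence of the method.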

Key Capabilities

  • General Reasoning: The model exhibits solid performance across various reasoning benchmarks, including AI2 Reasoning Challenge (71.16) and Winogrande (83.19).
  • Knowledge & Comprehension: It scores 64.35 on MMLU and 70.05 on TruthfulQA, indicating a good grasp of general knowledge and factual accuracy.
  • Mathematical Reasoning: Achieves 62.09 on GSM8k, suggesting moderate capabilities in mathematical problem-solving.
  • Efficient Merging: Utilizes the DARE TIES merge method, which sparsifies and rescales each source model's parameter deltas and resolves sign conflicts between them, combining the strengths of multiple fine-tunes into a single model (see the sketch in the Model Overview above).

Performance Highlights

Evaluated on the Open LLM Leaderboard, SnorkelWestBeagle-DARETIES-7B achieved an overall average score of 73.03. Notable individual scores include:

  • HellaSwag (10-Shot): 87.35
  • MMLU (5-Shot): 64.35
  • TruthfulQA (0-Shot): 70.05
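These scores come from the Open LLM Leaderboard's evaluation setup, which is backed by EleutherAI's lm-evaluation-harness. The snippet below is a hedged sketch of re-running one task locally with that harness's Python entry point (`lm_eval.simple_evaluate`); the dtype is an assumption, and local scores may differ slightly from the leaderboard run.

```python
# Sketch: re-running one leaderboard task locally with lm-evaluation-harness.
# The 10-shot setting mirrors the leaderboard's HellaSwag configuration.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=jsfs11/SnorkelWestBeagle-DARETIES-7B,dtype=float16",
    tasks=["hellaswag"],
    num_fewshot=10,
)
print(results["results"]["hellaswag"])
```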

Usage

The model can be loaded in Python with the Hugging Face transformers library for text-generation tasks.
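A minimal sketch is shown below; the dtype, device placement, and sampling settings are illustrative choices, not requirements of the model.

```python
# Minimal text-generation sketch; dtype and sampling settings are
# illustrative choices, not model requirements.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jsfs11/SnorkelWestBeagle-DARETIES-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # a 7B model in fp16 fits on a single 24 GB GPU
    device_map="auto",
)

prompt = "Explain the difference between merging and fine-tuning a language model."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```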