Gille/StrangeMerges_48-7B-dare_ties

  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 7B
  • Quant: FP8
  • Ctx Length: 8k
  • Published: Mar 26, 2024
  • License: apache-2.0
  • Architecture: Transformer
  • Open Weights

Gille/StrangeMerges_48-7B-dare_ties is a 7 billion parameter language model created by Gille by merging three models (StrangeMerges_46-7B-dare_ties, Percival_01-7b-slerp, and StrangeMerges_47-7B-dare_ties) with the dare_ties merge method. It is built on the Locutusque/Hercules-4.0-Mistral-v0.2-7B base and averages 57.89 on the Open LLM Leaderboard across six reasoning and language understanding benchmarks. It is suitable for general-purpose text generation and understanding applications where a 7B parameter model is appropriate.


Model Overview

Gille/StrangeMerges_48-7B-dare_ties is a 7 billion parameter language model developed by Gille. It merges three models: Gille/StrangeMerges_46-7B-dare_ties, AurelPx/Percival_01-7b-slerp, and Gille/StrangeMerges_47-7B-dare_ties. The merge was performed with the dare_ties method via LazyMergekit, using Locutusque/Hercules-4.0-Mistral-v0.2-7B as the base model.

Key Capabilities & Performance

The model achieves an average score of 57.89 on the Open LLM Leaderboard. Its results on the individual benchmarks are:

  • AI2 Reasoning Challenge (25-shot): 60.92
  • HellaSwag (10-shot): 80.13
  • MMLU (5-shot): 49.51
  • TruthfulQA (0-shot): 65.55
  • Winogrande (5-shot): 75.85
  • GSM8k (5-shot): 15.39

The model scores highest on commonsense reasoning and sentence completion (HellaSwag at 80.13, Winogrande at 75.85). Its MMLU score of 49.51 is noticeably lower, and its GSM8k score of 15.39 shows that mathematical reasoning is its weakest area.
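As a quick check on the figures above, the following sketch recomputes the average from the six listed scores. It assumes the leaderboard average is the unweighted mean of the six benchmarks, which reproduces the stated 57.89; the dictionary name and helper function are illustrative only.

```python
# Benchmark scores copied from the list above.
SCORES = {
    "AI2 Reasoning Challenge (25-shot)": 60.92,
    "HellaSwag (10-shot)": 80.13,
    "MMLU (5-shot)": 49.51,
    "TruthfulQA (0-shot)": 65.55,
    "Winogrande (5-shot)": 75.85,
    "GSM8k (5-shot)": 15.39,
}


def leaderboard_average(scores):
    """Unweighted mean of the benchmark scores (an assumption about the averaging)."""
    return sum(scores.values()) / len(scores)


average = leaderboard_average(SCORES)
weakest = min(SCORES, key=SCORES.get)    # benchmark with the lowest score
strongest = max(SCORES, key=SCORES.get)  # benchmark with the highest score

print(f"Average: {average:.2f}")  # Average: 57.89
print(f"Weakest: {weakest}")
print(f"Strongest: {strongest}")
```

The 64.74-point spread between HellaSwag and GSM8k is worth keeping in mind when reading the single average.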

Usage

The model supports standard text generation tasks and can be loaded with the Hugging Face transformers library. The merge is configured with the bfloat16 dtype, which halves memory use relative to float32 on hardware that supports it.
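A minimal sketch of loading the model in bfloat16 with the transformers `pipeline` API might look like the following. The sampling values (temperature, top_p, top_k, max_new_tokens) are illustrative assumptions, not settings recommended by the model author, and the helper names `sampling_kwargs` and `generate` are hypothetical. The sketch passes a plain text prompt and does not assume any particular chat template.

```python
MODEL_ID = "Gille/StrangeMerges_48-7B-dare_ties"


def sampling_kwargs(max_new_tokens=256, temperature=0.7, top_p=0.95, top_k=50):
    """Collect generation settings; the defaults are illustrative, not tuned."""
    if temperature <= 0:
        raise ValueError("temperature must be positive when sampling")
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
    }


def generate(prompt, **overrides):
    """Load the model in bfloat16 and return the generated text for one prompt."""
    # Imported lazily so sampling_kwargs() can be used without the heavy dependencies.
    import torch
    from transformers import pipeline

    pipe = pipeline(
        "text-generation",
        model=MODEL_ID,
        torch_dtype=torch.bfloat16,  # the dtype the merge is configured for
        device_map="auto",           # requires the `accelerate` package
    )
    outputs = pipe(prompt, **sampling_kwargs(**overrides))
    # The pipeline returns a list with one dict per generated sequence.
    return outputs[0]["generated_text"]
```

Calling `generate("Explain what a model merge is in two sentences.", temperature=0.5)` downloads the weights on first use and returns the prompt followed by the model's continuation. For repeated calls, building the pipeline once and reusing it would avoid reloading the model each time.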
