Gille/StrangeMerges_51-7B-dare_ties

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 8k · Published: Apr 1, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

Gille/StrangeMerges_51-7B-dare_ties is a 7 billion parameter language model created by Gille, formed by merging several specialized models, including WizardMath-7B-V1.1, NeuralCoder-7b, and Einstein-v4-7B, with the dare_ties method. The merge combines strengths in mathematical reasoning, code generation, and general language understanding, making it suitable for applications that need a blend of these capabilities, and it supports a context length of 8,192 tokens.


Model Overview

Gille/StrangeMerges_51-7B-dare_ties is a 7 billion parameter language model developed by Gille. This model is a sophisticated merge of five distinct base models, utilizing the dare_ties merging method via LazyMergekit. The constituent models include:

  • WizardLM/WizardMath-7B-V1.1: Contributes strong mathematical reasoning abilities.
  • Kukedlc/NeuralCoder-7b: Enhances code generation and understanding.
  • Weyaxi/Einstein-v4-7B: Provides general language and reasoning capabilities.
  • 0-hero/Matter-0.1-Slim-7B-C-DPO: Likely contributes to instruction following and alignment.
  • Gille/StrangeMerges_42-7B-dare_ties: An earlier merge from the same creator, suggesting iterative improvement.
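The dare_ties method used to combine these models works on per-model weight *deltas*: DARE randomly drops most of each fine-tuned model's delta from the base weights and rescales the survivors, and TIES then resolves sign conflicts between models before averaging. The toy sketch below illustrates the idea on flat parameter lists; it is a simplified illustration, not mergekit's actual implementation, and the function name and drop rate are illustrative.

```python
import random

def dare_ties_merge(base, finetuned_list, drop_rate=0.9, seed=0):
    """Toy DARE-TIES merge over flat parameter lists.

    DARE: drop each delta (finetuned - base) with probability drop_rate,
    rescaling survivors by 1 / (1 - drop_rate) to preserve expectation.
    TIES: at each position, elect the majority sign (by total magnitude),
    keep only agreeing deltas, and average them onto the base weights.
    """
    rng = random.Random(seed)

    # DARE step: sparsify and rescale each model's delta from the base.
    deltas = []
    for ft in finetuned_list:
        d = []
        for b, f in zip(base, ft):
            if rng.random() < drop_rate:
                d.append(0.0)  # dropped
            else:
                d.append((f - b) / (1.0 - drop_rate))  # rescaled survivor
        deltas.append(d)

    merged = []
    for i in range(len(base)):
        vals = [d[i] for d in deltas if d[i] != 0.0]
        if not vals:
            merged.append(base[i])  # no surviving delta: keep base weight
            continue
        # TIES sign election: pick the sign with the larger total magnitude.
        pos = sum(v for v in vals if v > 0)
        neg = -sum(v for v in vals if v < 0)
        sign = 1.0 if pos >= neg else -1.0
        elected = [v for v in vals if v * sign > 0]
        merged.append(base[i] + sum(elected) / len(elected))
    return merged
```

With `drop_rate=0.0` and identical fine-tuned models, the merge simply reproduces the fine-tuned weights; the high default drop rate is what lets many specialized models be combined with little mutual interference.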

Key Capabilities

  • Blended Expertise: Combines specialized strengths in mathematics, coding, and general reasoning from its merged components.
  • Instruction Following: Benefits from DPO-tuned components, improving response quality and alignment.
  • Performance: Achieves an average score of 71.73 on the Open LLM Leaderboard, with notable scores in:
    • HellaSwag (10-Shot): 85.90
    • Winogrande (5-shot): 82.08
    • GSM8k (5-shot): 70.13 (indicating strong mathematical problem-solving)

Good For

  • Applications requiring a balance of mathematical problem-solving and code generation.
  • General-purpose tasks where robust reasoning and instruction following are important.
  • Developers looking for a 7B model with a broad range of capabilities derived from specialized merges.

Popular Sampler Settings

Featherless users' most popular configurations for this model adjust the following sampler parameters: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.
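To make these parameters concrete, the sketch below shows how temperature, top_k, and top_p interact when filtering a next-token distribution: temperature reshapes the logits, top_k caps how many candidates survive, and top_p cuts the tail once the cumulative probability mass is reached. This is a minimal stdlib illustration of the standard sampling pipeline, not Featherless's implementation, and it omits the penalty parameters.

```python
import math

def filter_and_normalize(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Apply temperature, then top-k and top-p (nucleus) filtering,
    returning a normalized probability distribution over token indices.
    top_k=0 disables the top-k cap; top_p=1.0 disables nucleus cutoff."""
    # Temperature: divide logits before softmax; <1 sharpens, >1 flattens.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    probs = [math.exp(l - m) for l in scaled]
    total = sum(probs)
    probs = [p / total for p in probs]

    # Rank token indices by probability, highest first.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    keep = set()
    cumulative = 0.0
    for rank, i in enumerate(order):
        if top_k and rank >= top_k:
            break  # top-k cap reached
        keep.add(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break  # nucleus mass reached

    # Zero out filtered tokens and renormalize the survivors.
    filtered = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]
```

For example, `filter_and_normalize([2.0, 1.0, 0.5, -1.0], temperature=0.7, top_k=2)` leaves only the two most likely tokens with nonzero probability, renormalized to sum to 1.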