Gille/StrangeMerges_15-7B-slerp

Text generation · Model size: 7B · Quant: FP8 · Context length: 4k · Published: Jan 31, 2024 · License: apache-2.0 · Architecture: Transformer

Gille/StrangeMerges_15-7B-slerp is a 7 billion parameter language model created by Gille via a slerp merge of Gille/StrangeMerges_14-7B-slerp and CultriX/Wernicke-7B-v9. The model targets general text generation, and the merge is intended to yield balanced performance across benchmarks. It supports a 4096-token context length and scores an average of 72.41 on the Open LLM Leaderboard, indicating solid reasoning and language understanding capabilities.


Model Overview

Gille/StrangeMerges_15-7B-slerp is a 7 billion parameter language model developed by Gille. It is the product of a slerp (spherical linear interpolation) merge of two models: Gille/StrangeMerges_14-7B-slerp and CultriX/Wernicke-7B-v9. This merging technique blends the weights of its constituent models, aiming to combine their strengths into a single versatile model for natural language processing tasks.
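The card does not reproduce the exact merge recipe, but slerp merges like this are commonly built with the mergekit tool. A hypothetical config sketch is below; the layer ranges, interpolation schedule `t`, and choice of base model are assumptions for illustration, not the author's actual settings:

```yaml
# Hypothetical mergekit slerp config (values are illustrative assumptions)
slices:
  - sources:
      - model: Gille/StrangeMerges_14-7B-slerp
        layer_range: [0, 32]
      - model: CultriX/Wernicke-7B-v9
        layer_range: [0, 32]
merge_method: slerp
base_model: Gille/StrangeMerges_14-7B-slerp
parameters:
  t:
    - value: 0.5   # 0 = all base model, 1 = all second model
dtype: bfloat16
```

Here `t` controls where on the interpolation arc each layer's weights land; a per-layer or per-module schedule can be given instead of a single value.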

Key Capabilities & Performance

This model demonstrates strong performance across a range of benchmarks, as evaluated on the Open LLM Leaderboard. Key scores include:

  • Average score: 72.41
  • AI2 Reasoning Challenge (25-shot): 68.00
  • HellaSwag (10-shot): 86.82
  • MMLU (5-shot): 65.58
  • TruthfulQA (0-shot): 59.99
  • Winogrande (5-shot): 82.56
  • GSM8k (5-shot): 71.49

These results indicate a balanced capability in reasoning, common sense, language understanding, and mathematical problem-solving. The model supports a context length of 4096 tokens.

Usage

Developers can integrate StrangeMerges_15-7B-slerp into their projects using the Hugging Face transformers library; the model card provides Python examples for text generation. The model is configured to use bfloat16 for efficient computation.
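A minimal loading-and-generation sketch is below, assuming a standard transformers workflow; the prompt and generation settings are illustrative, and only the model id comes from this card:

```python
# Sketch: load Gille/StrangeMerges_15-7B-slerp and generate text.
# Prompt and max_new_tokens are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Gille/StrangeMerges_15-7B-slerp"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the card notes bfloat16 for efficient compute
    device_map="auto",           # place weights on available GPU(s) or CPU
)

prompt = "Explain spherical linear interpolation in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that a 7B model in bfloat16 needs roughly 14 GB of accelerator memory; the FP8 quantization listed above reduces that footprint on hosted deployments.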