luqmanxyz/LelaStarling-7B

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4K · Published: Jan 20, 2024 · License: apache-2.0 · Architecture: Transformer · Concurrency cost: 1

LelaStarling-7B is a 7-billion-parameter language model created by luqmanxyz by merging SanjiWatsuki/Lelantos-DPO-7B and berkeley-nest/Starling-LM-7B-alpha with the slerp merge method. It is designed for general text generation, combining the strengths of its constituent models, and achieves an average score of 71.45 on the Open LLM Leaderboard, with notable results on reasoning and common-sense benchmarks.


LelaStarling-7B Overview

LelaStarling-7B is a 7 billion parameter language model developed by luqmanxyz, created through a strategic merge of two distinct models: SanjiWatsuki/Lelantos-DPO-7B and berkeley-nest/Starling-LM-7B-alpha. This merge was executed using the slerp (spherical linear interpolation) method via LazyMergekit, combining the strengths of both base models.
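The slerp method interpolates each pair of corresponding weight tensors along the arc between them rather than along a straight line, which preserves the magnitude of the blended weights better than plain averaging. A minimal sketch of the per-tensor operation (illustrative only; mergekit's actual implementation handles dtype, layer filters, and per-layer interpolation factors):

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    Falls back to linear interpolation when the vectors are nearly
    colinear, since the spherical formula degenerates at angle ~ 0.
    """
    v0 = np.asarray(v0, dtype=np.float64)
    v1 = np.asarray(v1, dtype=np.float64)
    # Use unit vectors only to measure the angle between the tensors.
    u0 = v0 / np.linalg.norm(v0)
    u1 = v1 / np.linalg.norm(v1)
    dot = np.clip(np.dot(u0, u1), -1.0, 1.0)
    if abs(dot) > 1.0 - eps:
        return (1.0 - t) * v0 + t * v1  # lerp fallback
    theta = np.arccos(dot)              # angle between the vectors
    sin_theta = np.sin(theta)
    # Weight each endpoint so the result sweeps along the arc.
    return (np.sin((1.0 - t) * theta) / sin_theta) * v0 + \
           (np.sin(t * theta) / sin_theta) * v1

# Endpoints recover the original tensors exactly.
a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
print(slerp(0.0, a, b))  # -> [1. 0.]
print(slerp(1.0, a, b))  # -> [0. 1.]
print(slerp(0.5, a, b))  # midpoint stays on the unit circle
```

At `t = 0.5` the two parent models contribute equally; mergekit configs typically vary `t` across layers and module types (e.g. attention vs. MLP blocks).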

Key Capabilities & Performance

This model demonstrates solid performance across various benchmarks, as evaluated on the Open LLM Leaderboard. Its average score is 71.45, indicating a balanced capability for general language tasks. Specific benchmark results include:

  • AI2 Reasoning Challenge (25-shot): 67.58
  • HellaSwag (10-shot): 86.33
  • MMLU (5-shot): 64.98
  • TruthfulQA (0-shot): 57.73
  • Winogrande (5-shot): 80.98
  • GSM8k (5-shot): 71.11

These scores suggest proficiency in common sense reasoning, factual recall, and mathematical problem-solving, making it suitable for a range of applications requiring robust language understanding and generation.
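The reported leaderboard average is simply the unweighted mean of the six benchmark scores above, which is easy to check:

```python
# Benchmark scores from the Open LLM Leaderboard entry above.
scores = {
    "ARC (25-shot)": 67.58,
    "HellaSwag (10-shot)": 86.33,
    "MMLU (5-shot)": 64.98,
    "TruthfulQA (0-shot)": 57.73,
    "Winogrande (5-shot)": 80.98,
    "GSM8k (5-shot)": 71.11,
}
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # -> 71.45
```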

When to Use This Model

LelaStarling-7B is a strong candidate for use cases requiring a capable 7B parameter model with a balanced performance profile. Its merged architecture aims to provide a versatile foundation for tasks such as:

  • General text generation and completion
  • Question answering
  • Reasoning-based tasks
  • Applications benefiting from a model with good common-sense understanding