Artples/L-MChat-7b

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Context Length: 4K · Published: Apr 2, 2024 · License: apache-2.0 · Architecture: Transformer

Artples/L-MChat-7b is a 7 billion parameter language model created by Artples by merging Nexusflow/Starling-LM-7B-beta and FuseAI/FuseChat-7B-VaRM. The model uses a slerp merge to combine the strengths of its base models, offering balanced performance across benchmarks. It is designed for general conversational AI tasks and posts competitive results on the Open LLM Leaderboard.


Overview

Artples/L-MChat-7b is a 7 billion parameter language model developed by Artples, created by merging two models: Nexusflow/Starling-LM-7B-beta and FuseAI/FuseChat-7B-VaRM. The merge uses slerp (spherical linear interpolation), blending the layer weights of both base models to produce a balanced performance profile. The merge configuration specifies separate t (interpolation) schedules for the self_attn and mlp layers, so attention and feed-forward weights are blended in different ratios; a sketch of such a configuration appears below.
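
Slerp interpolates between two weight tensors along the arc connecting them rather than along a straight line: slerp(p, q; t) = sin((1−t)θ)/sin(θ) · p + sin(tθ)/sin(θ) · q, where θ is the angle between p and q. This tends to preserve weight geometry better than plain linear averaging. The sketch below shows what a mergekit-style slerp configuration for this pair of models looks like; the layer_range bounds, t schedules, base_model choice, and dtype are illustrative assumptions, not the published values.

```yaml
# Illustrative mergekit slerp configuration; all numeric values are placeholders.
slices:
  - sources:
      - model: Nexusflow/Starling-LM-7B-beta
        layer_range: [0, 32]
      - model: FuseAI/FuseChat-7B-VaRM
        layer_range: [0, 32]
merge_method: slerp
base_model: FuseAI/FuseChat-7B-VaRM   # assumed; one of the two parents serves as the base
parameters:
  t:
    - filter: self_attn               # interpolation schedule for attention projections
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp                     # separate schedule for feed-forward layers
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5                      # default for all remaining tensors
dtype: bfloat16
```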

Key Capabilities & Performance

L-MChat-7b performs solidly across a range of benchmarks on the original Open LLM Leaderboard, with an average score of 69.57. Notable results include:

  • HellaSwag (10-Shot): 84.59
  • Winogrande (5-shot): 81.37
  • AI2 Reasoning Challenge (25-Shot): 65.61
  • MMLU (5-Shot): 65.44

On the leaderboard's updated (v2) benchmark suite, the model averages 21.02, with scores including:

  • IFEval (0-Shot): 52.97
  • BBH (3-Shot): 24.20

Usage & Licensing

Developers can integrate L-MChat-7b using the Hugging Face transformers library; the model card ships Python code for text generation, and a minimal sketch is given below. The model is released under the Apache 2.0 license, with a specific restriction against direct competition with OpenAI. The merge itself was produced with LazyMergekit.
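
A minimal usage sketch with the transformers pipeline API follows; the prompt content and sampling parameters are illustrative choices, not values taken from the model card.

```python
import torch
from transformers import AutoTokenizer, pipeline

model_id = "Artples/L-MChat-7b"

# Format a chat turn with the model's built-in chat template.
tokenizer = AutoTokenizer.from_pretrained(model_id)
messages = [{"role": "user", "content": "What is a large language model?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# device_map="auto" places the model on GPU when one is available.
generator = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = generator(
    prompt,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
)
print(outputs[0]["generated_text"])
```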