Overview
Artples/L-MChat-7b is a 7-billion-parameter language model developed by Artples, created by merging two strong base models: Nexusflow/Starling-LM-7B-beta and FuseAI/FuseChat-7B-VaRM. The merge uses the slerp (spherical linear interpolation) method, combining layers from both models to balance their strengths. The merge configuration specifies distinct interpolation (t) values for the self_attn and mlp layers, so the two architectures are blended at different ratios per component.
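A slerp merge like this is typically expressed as a mergekit/LazyMergekit YAML configuration along the following lines. This is a sketch only: the layer ranges and the per-filter t curves shown here are illustrative assumptions, not the model's published values.

```yaml
slices:
  - sources:
      - model: Nexusflow/Starling-LM-7B-beta
        layer_range: [0, 32]   # assumed range for a 7B Mistral-style model
      - model: FuseAI/FuseChat-7B-VaRM
        layer_range: [0, 32]
merge_method: slerp
base_model: Nexusflow/Starling-LM-7B-beta
parameters:
  t:
    # Separate interpolation curves per layer type, as the card describes;
    # the specific values below are placeholders.
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5   # default t for all remaining tensors
dtype: bfloat16
```

Each `value` list is interpolated across the layer stack, so early and late layers can lean toward different parents.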
Key Capabilities & Performance
L-MChat-7b demonstrates solid performance across a range of benchmarks, as evaluated on the Open LLM Leaderboard. Its average score is 69.57, with notable results including:
- HellaSwag (10-Shot): 84.59
- Winogrande (5-shot): 81.37
- AI2 Reasoning Challenge (25-Shot): 65.61
- MMLU (5-Shot): 65.44
On the newer Open LLM Leaderboard (v2), which uses harder, normalized benchmarks, the model averages 21.02, with scores including:
- IFEval (0-Shot): 52.97
- BBH (3-Shot): 24.20
Usage & Licensing
Developers can integrate L-MChat-7b with the Hugging Face transformers library; the model card includes Python code for text generation. The model is released under the Apache 2.0 license, with a specific restriction against using it to compete directly with OpenAI. The merge itself was performed with LazyMergekit.
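A minimal text-generation sketch using the transformers pipeline API, in the style such model cards typically provide. The prompt text and generation parameters (temperature, top_p, max_new_tokens) are illustrative choices, not settings prescribed by the model authors.

```python
import torch
from transformers import AutoTokenizer, pipeline

model_id = "Artples/L-MChat-7b"

# Build a chat-formatted prompt using the model's own chat template.
tokenizer = AutoTokenizer.from_pretrained(model_id)
messages = [{"role": "user", "content": "What is a language model?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# Load the model for generation; half precision and device_map="auto"
# keep memory usage manageable on a single GPU.
generator = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = generator(
    prompt,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
)
print(outputs[0]["generated_text"])
```

The pipeline returns a list of dicts whose `generated_text` field contains the prompt plus the model's continuation.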