Overview
L-MChat-Small: A Compact Merged Language Model
L-MChat-Small is a 3-billion-parameter language model developed by Artples to investigate the performance potential of smaller, merged architectures. Rather than competing on scale, it focuses on efficiency while maintaining utility for conversational tasks.
Key Capabilities & Features
- Architecture: A merged model utilizing the SLERP method, combining rhysjones/phi-2-orange-v2 and Weyaxi/Einstein-v4-phi2 (see the sketch after this list).
- Parameter Count: 3 billion parameters, offering a more compact footprint compared to larger models.
- Context Length: Supports a 2048-token context window.
- Performance: Achieves an average score of 63.14 on the Open LLM Leaderboard, including 61.60 on the AI2 Reasoning Challenge and 75.90 on HellaSwag.
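For intuition, SLERP (spherical linear interpolation) blends two weight tensors along the arc of the hypersphere between them rather than along a straight line, which tends to preserve weight norms better than plain averaging. The following is a minimal, self-contained sketch of the standard SLERP formula as commonly applied to model merging; it is not the exact implementation used to build this model, and the per-tensor flattening and parallel-vector fallback are simplifying assumptions.

```python
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two same-shaped weight tensors."""
    # Work on flattened double-precision copies so the formula applies to any shape.
    a = v0.flatten().double()
    b = v1.flatten().double()

    # Angle between the two tensors, via their cosine similarity.
    cos_omega = torch.dot(a, b) / (a.norm() * b.norm() + eps)
    omega = torch.acos(cos_omega.clamp(-1.0, 1.0))
    sin_omega = torch.sin(omega)

    if sin_omega.abs() < eps:
        # Nearly parallel tensors: fall back to linear interpolation.
        out = (1.0 - t) * a + t * b
    else:
        out = (torch.sin((1.0 - t) * omega) / sin_omega) * a \
            + (torch.sin(t * omega) / sin_omega) * b

    return out.reshape(v0.shape).to(v0.dtype)

# Example: merge two same-shaped layers at the midpoint (t = 0.5).
layer_a = torch.randn(4, 4)
layer_b = torch.randn(4, 4)
merged = slerp(0.5, layer_a, layer_b)
```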
Use Cases & Strengths
- General Chat Applications: Optimized for conversational interactions using the ChatML format; a usage sketch follows this list.
- Resource-Constrained Environments: Its smaller size makes it suitable for deployment where computational resources are limited.
- Exploration of Merge Methods: Demonstrates the effectiveness of the SLERP merge method for creating capable models from existing smaller bases.
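As a concrete starting point, the sketch below loads the model with Hugging Face transformers and formats a conversation via the tokenizer's chat template, assuming the repository ships a ChatML template. The Hub ID Artples/L-MChat-Small is an assumption based on the author and model names, and the 4-bit quantization shown is one optional way to fit the model into constrained memory (it requires the bitsandbytes and accelerate packages).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Assumed Hub ID; verify against the actual model repository.
model_id = "Artples/L-MChat-Small"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Optional 4-bit quantization to reduce the memory footprint in
# resource-constrained environments.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)

# Build a ChatML-style conversation; apply_chat_template renders it
# with the <|im_start|>/<|im_end|> markers the model expects.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize what a SLERP model merge is."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

If the tokenizer does not include a chat template, the ChatML markers can be written into the prompt string by hand instead.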