Overview
Undi95/ReMM-SLERP-L2-13B: A Recreated and Merged 13B Model
Undi95/ReMM-SLERP-L2-13B is a 13-billion-parameter language model that recreates the original MythoMax-L2-13b merge with updated base models, combined using the SLERP merging method. Built on the Llama-2-13B architecture, it merges several well-regarded fine-tunes to strengthen its overall capabilities.
Key Characteristics
- Merged Architecture: Created by merging The-Face-Of-Goonery/Chronos-Beluga-v2-13bfp16, jondurbin/airoboros-l2-13b-2.1, NousResearch/Nous-Hermes-Llama2-13b, and The-Face-Of-Goonery/Huginn-13b-v1.2.
- SLERP Merging: Uses SLERP (spherical linear interpolation), in a version of the merge script adapted for notebook use, to combine the strengths of its constituent models; a sketch of the technique follows this list.
- Context Length: Supports a context length of 4096 tokens.
- Instruction Following: Expects the Alpaca prompt template, indicating an optimization for instruction-based tasks (the usage example under Use Cases shows the format).
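The exact notebook recipe used for this merge is not reproduced here, but the core SLERP operation, applied per weight tensor, can be sketched in Python as follows. The state dicts `a` and `b` in the final comment are hypothetical placeholders for two source checkpoints; this is a minimal illustration, not the author's actual script.

```python
import torch

def slerp(t: float, v0: torch.Tensor, v1: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate values follow the arc
    between the two tensors' directions rather than the straight line.
    """
    v0_flat = v0.flatten().float()
    v1_flat = v1.flatten().float()

    # Normalize copies only to measure the angle between the two tensors.
    v0_unit = v0_flat / (v0_flat.norm() + eps)
    v1_unit = v1_flat / (v1_flat.norm() + eps)
    dot = torch.clamp(torch.dot(v0_unit, v1_unit), -1.0, 1.0)
    theta = torch.acos(dot)

    # Nearly colinear tensors: fall back to plain linear interpolation.
    if theta < 1e-4:
        return (1.0 - t) * v0 + t * v1

    sin_theta = torch.sin(theta)
    w0 = torch.sin((1.0 - t) * theta) / sin_theta
    w1 = torch.sin(t * theta) / sin_theta
    merged = w0 * v0_flat + w1 * v1_flat
    return merged.reshape(v0.shape).to(v0.dtype)

# Hypothetical per-parameter merge over two loaded state dicts a and b:
# merged_state = {name: slerp(0.5, a[name], b[name]) for name in a}
```

The sin-based weights interpolate along the arc between the two weight directions instead of along the straight chord, which is the property that distinguishes SLERP from plain linear averaging.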
Performance Benchmarks
Evaluated on the Open LLM Leaderboard, ReMM-SLERP-L2-13B achieved an average score of 50.99 across the leaderboard's full task suite. Notable individual scores include:
- ARC (25-shot): 60.92
- HellaSwag (10-shot): 83.56
- MMLU (5-shot): 55.33
- TruthfulQA (0-shot): 51.97
- Winogrande (5-shot): 75.22
Use Cases
This model suits applications that require reliable instruction following and general conversational ability, drawing on the combined knowledge and reasoning of its merged components.
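As a sketch of how the model might be loaded and prompted, the following uses the Hugging Face transformers library with an Alpaca-formatted prompt. The generation settings are illustrative, not tuned recommendations.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Undi95/ReMM-SLERP-L2-13B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # place the 13B weights on available GPU(s)
    torch_dtype="auto",  # load in the checkpoint's native precision
)

# Alpaca-style prompt, matching the template noted above.
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n"
    "Explain spherical linear interpolation in two sentences.\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Because the merge targets the Alpaca template, keeping the `### Instruction:` / `### Response:` structure generally yields more reliable instruction following than free-form prompting.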