Overview
wanlige/li-14b-v0.4-slerp0.1 is a 14.8-billion-parameter language model created by wanlige through a SLERP (Spherical Linear Interpolation) merge. It combines two base models, wanlige/li-14b-v0.4 and sthenno-com/miscii-14b-0218, with the aim of blending their respective strengths for improved overall performance.
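To give a rough intuition for what a SLERP merge does, the sketch below interpolates two weight tensors along the arc between them rather than along a straight line. This is a minimal NumPy illustration only, not mergekit's actual implementation; the function name and fallback threshold are assumptions for the example.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    t=0 returns v0, t=1 returns v1; intermediate t values follow
    the arc between the (normalized) directions of v0 and v1.
    Illustrative sketch -- not mergekit's implementation.
    """
    v0n = v0 / np.linalg.norm(v0)
    v1n = v1 / np.linalg.norm(v1)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    # Nearly parallel vectors: fall back to plain linear interpolation
    if abs(dot) > 1.0 - eps:
        return (1 - t) * v0 + t * v1
    theta = np.arccos(dot)  # angle between the two directions
    s0 = np.sin((1 - t) * theta) / np.sin(theta)
    s1 = np.sin(t * theta) / np.sin(theta)
    return s0 * v0 + s1 * v1
```

For unit vectors, the interpolated result stays on the unit sphere, which is the property that motivates SLERP over naive averaging when merging model weights.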
Merge Details
The model was constructed using the mergekit tool with a specific SLERP configuration. The merge involved applying different interpolation values across self-attention and MLP layers, as well as a layer-wise interpolation schedule, to blend the characteristics of the source models.
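A merge like the one described above maps onto mergekit's SLERP config schema. The exact configuration used for this model is not reproduced here; the snippet below is an illustrative sketch, and the layer ranges and per-filter `t` schedules are placeholder values, not the author's actual numbers.

```yaml
# Illustrative mergekit SLERP config -- placeholder values,
# not the actual settings used for li-14b-v0.4-slerp0.1.
merge_method: slerp
base_model: wanlige/li-14b-v0.4
slices:
  - sources:
      - model: wanlige/li-14b-v0.4
        layer_range: [0, 48]     # placeholder layer range
      - model: sthenno-com/miscii-14b-0218
        layer_range: [0, 48]
parameters:
  t:
    - filter: self_attn          # separate schedule for attention weights
      value: [0, 0.3, 0.5, 0.7, 1]
    - filter: mlp                # separate schedule for MLP weights
      value: [1, 0.7, 0.5, 0.3, 0]
    - value: 0.5                 # default for all remaining tensors
dtype: bfloat16
```

A config of this shape would typically be run with mergekit's `mergekit-yaml` command, producing a merged checkpoint directory.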
Key Capabilities
- General Language Understanding: Designed to handle a broad range of natural language processing tasks.
- Combined Strengths: Benefits from the merged weights of wanlige/li-14b-v0.4 and sthenno-com/miscii-14b-0218.
- 32K Context Length: Supports processing longer sequences of text, up to 32,768 tokens.
Open LLM Leaderboard Evaluation
This model has been evaluated on the Open LLM Leaderboard, achieving an average score of 42.91. Notable individual metric scores include:
- IFEval (0-shot): 79.23
- BBH (3-shot): 50.88
- MATH Lvl 5 (4-shot): 53.32
- MMLU-PRO (5-shot): 47.71
Good for
- Developers seeking a merged model that combines the characteristics of its base components.
- Applications requiring a 14B-parameter model with a substantial context window.
- Experimentation with SLERP-merged architectures for general-purpose LLM tasks.