cookinai/CM-14: A Slerp Merged 7B Model
CM-14 is a 7-billion-parameter language model developed by cookinai and created with a Slerp merge. It combines two parent models:
- cookinai/CatMacaroni-Slerp
- EmbeddedLLM/Mistral-7B-Merge-14-v0.2
The merge is defined in a .yaml configuration that varies the interpolation factor t by layer and by tensor type. For self_attn tensors, t ranges from 0 to 1 across the layers; for mlp tensors it runs in the opposite direction, from 1 to 0; all other tensors use a fallback of 0.5. This per-layer control lets each part of the merged network draw more heavily on one parent model or the other, rather than blending every tensor equally.
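To make the per-layer schedule concrete, here is a minimal Python sketch of how a t value might be chosen for each tensor. It is an illustration, not the configuration actually used for CM-14: only the endpoints (0 to 1 for self_attn, 1 to 0 for mlp) and the 0.5 fallback come from the description above. The linear ramp between the endpoints, the names T_ANCHORS and t_for_tensor, and the tensor-name matching are assumptions.

```python
import numpy as np

NUM_LAYERS = 32  # Mistral 7B has 32 transformer layers

# Assumed anchor values: only the endpoints are taken from the model card.
# The real configuration may list intermediate values as well.
T_ANCHORS = {
    "self_attn": [0.0, 1.0],
    "mlp": [1.0, 0.0],
}
FALLBACK_T = 0.5  # used for every tensor that matches no filter


def t_for_tensor(tensor_name: str, layer_idx: int, num_layers: int = NUM_LAYERS) -> float:
    """Return the interpolation factor t for one tensor in one layer."""
    for key, anchors in T_ANCHORS.items():
        if key in tensor_name:
            # Relative depth of the layer, from 0.0 (first) to 1.0 (last).
            depth = layer_idx / (num_layers - 1)
            # Spread the anchors evenly over the depth and interpolate linearly.
            positions = np.linspace(0.0, 1.0, len(anchors))
            return float(np.interp(depth, positions, anchors))
    return FALLBACK_T


if __name__ == "__main__":
    for idx in (0, 15, 31):
        attn = t_for_tensor(f"model.layers.{idx}.self_attn.q_proj.weight", idx)
        mlp = t_for_tensor(f"model.layers.{idx}.mlp.up_proj.weight", idx)
        print(f"layer {idx:2d}: self_attn t={attn:.3f}, mlp t={mlp:.3f}")
```

Under these assumptions, the attention and MLP tensors of any given layer receive complementary t values, so a layer that leans toward one parent for attention leans toward the other for its MLP.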
Key Characteristics
- Architecture: Based on the Mistral 7B family, inheriting its efficient design.
- Parameter Count: 7 billion parameters, offering a balance between performance and computational requirements.
- Context Length: Supports a 4096-token context window, suitable for various conversational and text generation tasks.
- Merge Method: Uses Slerp (Spherical Linear Interpolation), which interpolates between two sets of weights along an arc rather than along a straight line.
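The Slerp operation itself can be sketched in a few lines of NumPy. This is a generic textbook formulation applied to a single pair of tensors, not the code used to build CM-14. The function name slerp, the eps threshold, and the fallback to linear interpolation for nearly parallel tensors are choices made for this example.

```python
import numpy as np


def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherically interpolate between two weight tensors of the same shape."""
    a = v0.ravel().astype(np.float64)
    b = v1.ravel().astype(np.float64)
    # Angle between the two tensors, treated as flat vectors.
    cos_omega = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps)
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    sin_omega = np.sin(omega)
    if sin_omega < eps:
        # Nearly parallel tensors: plain linear interpolation is stable here.
        return (1.0 - t) * v0 + t * v1
    # Standard Slerp weights; they reduce to (1, 0) at t=0 and (0, 1) at t=1.
    w0 = np.sin((1.0 - t) * omega) / sin_omega
    w1 = np.sin(t * omega) / sin_omega
    return w0 * v0 + w1 * v1


if __name__ == "__main__":
    v0 = np.array([1.0, 0.0])
    v1 = np.array([0.0, 1.0])
    mid = slerp(0.5, v0, v1)
    print(mid, np.linalg.norm(mid))
```

For the two orthogonal unit vectors in the usage example, the Slerp midpoint keeps unit length, whereas the straight-line average [0.5, 0.5] would shrink to a norm of about 0.707. That difference in how magnitude is treated is the usual motivation for preferring Slerp over plain averaging when merging weights.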
Potential Use Cases
Given its 7B parameter size and merged lineage, CM-14 is likely well-suited for:
- General-purpose text generation and completion.
- Chatbot applications requiring moderate context.
- Experimentation with merged model architectures.
- Tasks where a balance of performance and resource efficiency is desired.