Model Overview
Saka-14B is a 14.8-billion-parameter language model developed by Sakalti. It was created with the TIES (TrIm, Elect Sign & Merge) merge method, which combines multiple pre-trained language models into a single, more capable model.
Merge Details
The base model for Saka-14B is sometimesanotion/Qwenvergence-14B-v11, with sometimesanotion/Lamarck-14B-v0.7 merged in as a contributing model. As described in the original TIES-Merging paper, the method reduces interference between merged models by trimming low-magnitude parameter changes, resolving sign conflicts across models, and averaging only the parameters that agree.
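Merges of this kind are typically produced with the mergekit library. A hypothetical configuration for this merge might look like the following; the `density`, `weight`, and `dtype` values are illustrative assumptions, not the settings actually used for Saka-14B:

```yaml
# Hypothetical mergekit config sketch; parameter values are assumptions.
merge_method: ties
base_model: sometimesanotion/Qwenvergence-14B-v11
models:
  - model: sometimesanotion/Lamarck-14B-v0.7
    parameters:
      density: 0.5   # fraction of parameter deltas kept after trimming
      weight: 1.0    # relative weight of this model in the merge
dtype: bfloat16
```

Here `density` controls the trim step of TIES (how many of each model's largest-magnitude parameter changes are retained before sign election and merging).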
Key Characteristics
- Parameter Count: 14.8 billion parameters.
- Merge Method: Utilizes the TIES merging technique for combining model weights.
- Base Model: Built upon the Qwenvergence-14B-v11 architecture.
- Context Length: A 131,072-token context window, enabling processing of very long inputs.
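At a high level, TIES operates on "task vectors" (the difference between each fine-tuned model's weights and the base model's): it trims each vector to its largest-magnitude entries, elects a per-parameter sign by total magnitude, and averages only the entries that agree with that sign. A minimal NumPy sketch of the idea (a simplified illustration, not the actual mergekit implementation):

```python
import numpy as np

def ties_merge(base, finetuned, density=0.5):
    """Merge fine-tuned weight vectors into `base` via the TIES steps:
    trim -> elect sign -> disjoint mean. Simplified for illustration."""
    # 1. Task vectors: per-model deltas from the base weights.
    deltas = [ft - base for ft in finetuned]

    # 2. Trim: keep only the top-`density` fraction of entries by magnitude.
    trimmed = []
    for d in deltas:
        k = max(1, int(round(density * d.size)))
        threshold = np.sort(np.abs(d))[-k]
        trimmed.append(np.where(np.abs(d) >= threshold, d, 0.0))

    # 3. Elect sign: per-parameter sign with the larger total mass.
    stacked = np.stack(trimmed)
    elected = np.sign(np.sum(stacked, axis=0))

    # 4. Disjoint mean: average only entries agreeing with the elected sign.
    agree = (np.sign(stacked) == elected) & (stacked != 0)
    counts = np.maximum(agree.sum(axis=0), 1)
    merged_delta = np.where(agree, stacked, 0.0).sum(axis=0) / counts

    return base + merged_delta

# Toy example with two "fine-tuned" 4-parameter models.
base = np.zeros(4)
ft_a = np.array([1.0, -0.2,  0.5, 0.0])
ft_b = np.array([0.8,  0.1, -0.4, 0.0])
print(ties_merge(base, [ft_a, ft_b], density=0.5))
```

In the toy example, parameter 0 is kept by both models and averaged, parameter 2 has a sign conflict resolved in favor of the larger-magnitude direction, and the small deltas are trimmed away.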
Potential Use Cases
Given its merged nature and large context window, Saka-14B is suitable for applications requiring:
- Processing and understanding long documents or conversations.
- Tasks that benefit from the combined knowledge and capabilities of its constituent models.
- Exploration of merged model performance in various NLP benchmarks.