Sakalti/Saka-14B

14.8B parameters · FP8 · 131072-token context · Published Feb 5, 2025 · Hugging Face

Model Overview

Saka-14B is a 14.8 billion parameter language model developed by Sakalti. It was created with the TIES merge method (TrIm, Elect Sign & Merge), which combines multiple pre-trained language models into a single, more capable model.

Merge Details

The base model for Saka-14B is sometimesanotion/Qwenvergence-14B-v11, with sometimesanotion/Lamarck-14B-v0.7 as a contributing model. As described in the original paper ("TIES-Merging: Resolving Interference When Merging Models"), the TIES method reduces interference between merged models by trimming low-magnitude parameter changes, electing a consistent sign for each parameter, and averaging only the values that agree with that sign.
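The trim / elect-sign / merge steps can be illustrated on toy NumPy arrays. This is a simplified sketch of the general TIES procedure, not the actual mergekit implementation; the function name and parameters (`density`, `lam`) are illustrative assumptions.

```python
import numpy as np

def ties_merge(base, finetuned, density=0.5, lam=1.0):
    """Toy TIES merge: trim, elect sign, disjoint mean (illustrative only)."""
    # Task vectors: each fine-tuned model's delta from the base weights.
    deltas = [ft - base for ft in finetuned]

    # Trim: keep only the top-`density` fraction of each delta by magnitude.
    trimmed = []
    for d in deltas:
        k = max(1, int(round(density * d.size)))
        thresh = np.sort(np.abs(d).ravel())[-k]
        trimmed.append(np.where(np.abs(d) >= thresh, d, 0.0))

    # Elect: per-parameter sign with the largest total magnitude across models.
    stacked = np.stack(trimmed)
    sign = np.sign(np.sum(stacked, axis=0))
    sign[sign == 0] = 1.0  # break ties toward positive

    # Merge: average only the trimmed values that agree with the elected sign.
    agree = (np.sign(stacked) == sign) & (stacked != 0)
    counts = np.maximum(agree.sum(axis=0), 1)
    merged = np.where(agree, stacked, 0.0).sum(axis=0) / counts

    return base + lam * merged
```

Note how the disagreeing third parameter below is resolved by sign election rather than naive averaging (which would cancel it to zero):

```python
base = np.zeros(4)
m1 = np.array([1.0, -0.1, 2.0, 0.0])
m2 = np.array([1.0, 0.2, -2.0, 0.5])
ties_merge(base, [m1, m2])  # keeps the agreeing +1.0 and the elected +2.0
```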

Key Characteristics

  • Parameter Count: 14.8 billion parameters.
  • Merge Method: Utilizes the TIES merging technique for combining model weights.
  • Base Model: Built upon the Qwenvergence-14B-v11 architecture.
  • Context Length: 131072-token context window, enabling processing of very long inputs.
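Merges of this kind are typically produced with the mergekit toolkit. A plausible TIES configuration for the models named above might look like the following; the `density`, `weight`, and `dtype` values are illustrative assumptions, not the actual settings used for Saka-14B.

```yaml
# Hypothetical mergekit config sketch for a TIES merge (values are assumptions)
merge_method: ties
base_model: sometimesanotion/Qwenvergence-14B-v11
models:
  - model: sometimesanotion/Lamarck-14B-v0.7
    parameters:
      density: 0.5   # fraction of parameter deltas kept after trimming
      weight: 0.5    # scaling applied to this model's contribution
parameters:
  normalize: true
dtype: bfloat16
```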

Potential Use Cases

Given its merged nature and large context window, Saka-14B is suitable for applications requiring:

  • Processing and understanding long documents or conversations.
  • Tasks that benefit from the combined knowledge and capabilities of its constituent models.
  • Exploration of merged model performance in various NLP benchmarks.