Sakalti/Saka-14B

Text generation · Model size: 14.8B · Quant: FP8 · Context length: 32k · Published: Feb 5, 2025 · Architecture: Transformer

Saka-14B is a 14.8-billion-parameter language model created by Sakalti using the TIES merge method. It is based on sometimesanotion/Qwenvergence-14B-v11 and incorporates sometimesanotion/Lamarck-14B-v0.7. As a merged model, it is designed to combine the strengths of its constituent models, and it features a notable context length of 131,072 tokens.


Model Overview

Saka-14B is a 14.8-billion-parameter language model developed by Sakalti. It was created using the TIES (TrIm, Elect Sign & Merge) method, which combines multiple fine-tuned language models into a single, more capable model while reducing interference between their parameter updates.

Merge Details

The base model for Saka-14B is sometimesanotion/Qwenvergence-14B-v11, with sometimesanotion/Lamarck-14B-v0.7 as a contributing model. The TIES method, as described in the original paper ("TIES-Merging: Resolving Interference When Merging Models"), works in three steps: it trims each model's low-magnitude parameter changes, elects a majority sign for each parameter, and then averages only the values that agree with the elected sign.
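The trim/elect/merge procedure can be sketched on toy weight vectors. This is a simplified illustration of the TIES idea, not the actual implementation used to build Saka-14B; the function name `ties_merge` and the parameters `k` (fraction of entries kept when trimming) and `lam` (scaling of the merged update) are assumptions for the sketch.

```python
import numpy as np

def ties_merge(base, finetuned, k=0.5, lam=1.0):
    """Toy TIES merge of several fine-tuned weight vectors onto a base."""
    # Task vectors: each fine-tuned model's delta from the base weights.
    taus = [ft - base for ft in finetuned]

    # 1) Trim: keep only the top-k fraction of entries by magnitude,
    #    zeroing out small (likely redundant) parameter changes.
    trimmed = []
    for tau in taus:
        thresh = np.quantile(np.abs(tau), 1 - k)
        trimmed.append(np.where(np.abs(tau) >= thresh, tau, 0.0))

    # 2) Elect sign: for each parameter, take the sign of the total
    #    trimmed mass across models (the "majority" direction).
    elected = np.sign(sum(trimmed))

    # 3) Disjoint merge: average only the values whose sign agrees
    #    with the elected sign, ignoring conflicting updates.
    agree = [np.where(np.sign(t) == elected, t, 0.0) for t in trimmed]
    counts = sum((np.sign(t) == elected) & (t != 0) for t in trimmed)
    merged = sum(agree) / np.maximum(counts, 1)

    return base + lam * merged
```

On two toy "models" that agree on one parameter and conflict on another, the conflicting update is dropped rather than averaged, which is the point of the sign-election step.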

Key Characteristics

  • Parameter Count: 14.8 billion parameters.
  • Merge Method: Utilizes the TIES merging technique for combining model weights.
  • Base Model: Built upon the Qwenvergence-14B-v11 architecture.
  • Context Length: Features a substantial context window of 131072 tokens, allowing for processing of extensive inputs.
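Merges of this kind are typically produced with a tool such as mergekit. A hypothetical TIES configuration for this merge might look like the following; the `density` and `weight` values are illustrative assumptions, not the actual recipe used for Saka-14B:

```yaml
models:
  - model: sometimesanotion/Lamarck-14B-v0.7
    parameters:
      density: 0.5   # fraction of parameters kept in the trim step (assumed)
      weight: 1.0    # scaling applied to this model's task vector (assumed)
merge_method: ties
base_model: sometimesanotion/Qwenvergence-14B-v11
parameters:
  normalize: true
dtype: bfloat16
```

Here `density` corresponds to TIES's trimming threshold and `normalize` rescales the merged task vector before it is added back to the base weights.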

Potential Use Cases

Given its merged nature and large context window, Saka-14B is suitable for applications requiring:

  • Processing and understanding long documents or conversations.
  • Tasks that benefit from the combined knowledge and capabilities of its constituent models.
  • Exploration of merged model performance in various NLP benchmarks.