jaspionjader/Kosmos-EVAA-Franken-stock-v43-8B

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 32k | Published: Jan 23, 2025 | Architecture: Transformer

jaspionjader/Kosmos-EVAA-Franken-stock-v43-8B is an 8-billion-parameter language model created by jaspionjader with the Model Stock merge method. It merges jaspionjader/ek-5 and jaspionjader/ek-6 on top of the base model jaspionjader/ek-3 and supports a context length of 32,768 tokens. As a merge, it is intended to combine the strengths of its source models for general language tasks.


Overview

jaspionjader/Kosmos-EVAA-Franken-stock-v43-8B is an 8-billion-parameter language model developed by jaspionjader. It was created with the Model Stock merge method, described in the paper "Model Stock: All we need is just a few fine-tuned models", which merges the weights of several fine-tuned models that share a common base model.
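To make the merge method concrete, here is a minimal sketch of the Model Stock rule for a single weight tensor, written with plain Python lists. It follows the interpolation ratio from the paper, t = k·cos θ / (1 + (k − 1)·cos θ), where θ is the angle between the fine-tuned models' offsets from the base. The function name `model_stock_merge` and the flat-list representation are assumptions made for illustration; this is not mergekit's code, and a real merge would apply the rule tensor by tensor.

```python
import math


def model_stock_merge(base, finetuned):
    """Merge one weight tensor (flat lists of floats) with the Model Stock rule.

    Hypothetical helper for illustration only.
    """
    k = len(finetuned)
    if k < 2:
        raise ValueError("Model Stock needs at least two fine-tuned models")

    # Task vectors: how far each fine-tuned model moved away from the base.
    deltas = [[w - b for w, b in zip(model, base)] for model in finetuned]

    def cosine(u, v):
        dot = sum(x * y for x, y in zip(u, v))
        norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
        return dot / norm if norm > 0 else 0.0

    # Average cosine similarity over all pairs of task vectors.
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    cos_theta = sum(cosine(deltas[i], deltas[j]) for i, j in pairs) / len(pairs)

    # Interpolation ratio from the paper; undefined if the denominator is zero
    # (for example, two exactly opposite task vectors).
    t = k * cos_theta / (1 + (k - 1) * cos_theta)

    # Blend the average of the fine-tuned weights with the base weights.
    avg = [sum(ws) / k for ws in zip(*finetuned)]
    merged = [t * a + (1 - t) * b for a, b in zip(avg, base)]
    return merged, t
```

When the task vectors are orthogonal, t is 0 and the result falls back to the base weights; when they point in the same direction, t is 1 and the result is the plain average of the fine-tuned weights.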

Merge Details

The merge uses jaspionjader/ek-3 as the base model and combines two models:

  • jaspionjader/ek-5
  • jaspionjader/ek-6

The merge aims to combine the capabilities of its source models in a single model. It was performed with mergekit using the bfloat16 data type, which stores each weight in 16 bits while keeping the exponent range of float32.
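The author's exact configuration is not reproduced here, so the following is only a reconstruction from the facts stated above (base model, two source models, Model Stock, bfloat16). It renders a mergekit-style YAML configuration as text; `render_mergekit_config` is a hypothetical helper, and any additional parameters the author may have set are unknown.

```python
def render_mergekit_config(base_model, models, dtype="bfloat16"):
    """Return mergekit-style YAML text for a Model Stock merge (hypothetical helper)."""
    lines = ["models:"]
    lines += [f"  - model: {name}" for name in models]
    lines += [
        "merge_method: model_stock",
        f"base_model: {base_model}",
        f"dtype: {dtype}",
    ]
    return "\n".join(lines) + "\n"


# Values taken from the merge description; the layout is an assumption.
config_text = render_mergekit_config(
    base_model="jaspionjader/ek-3",
    models=["jaspionjader/ek-5", "jaspionjader/ek-6"],
)
print(config_text)
```

Saved to a file, text like this could be passed to mergekit's `mergekit-yaml` command together with an output directory, assuming mergekit is installed and the source models are accessible.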

Key Characteristics

  • Parameter Count: 8 billion.
  • Context Length: 32,768 tokens (32k).
  • Merge Method: Model Stock.
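As a rough guide to what these figures mean for hardware, the sketch below estimates the storage needed for the weights alone at different precisions. It takes "8B" at face value as 8,000,000,000 parameters, which is an assumption (the exact count is not given here), and it ignores the KV cache, activations, and runtime overhead. The names `BYTES_PER_PARAM` and `weight_memory_gb` are illustrative.

```python
# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "fp8": 1}

NOMINAL_PARAMS = 8_000_000_000  # assumption: "8B" taken at face value


def weight_memory_gb(num_params, dtype):
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: ~{weight_memory_gb(NOMINAL_PARAMS, dtype):.0f} GB")
```

Under these assumptions the bfloat16 weights take about 16 GB, and the FP8 quantization listed in the page metadata would take about half of that.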

Intended Use

This model suits applications that need a general-purpose language model built from merged components. Its 32,768-token context window makes it useful for tasks that involve analyzing or generating long texts.
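A minimal sketch of local use follows, assuming the weights are published on the Hugging Face Hub under the same ID and load through the standard transformers causal-LM classes (neither is confirmed here). The helper `max_new_tokens_for` and the wrapper `generate` are hypothetical names; the former simply keeps prompt plus completion inside the 32,768-token window.

```python
MODEL_ID = "jaspionjader/Kosmos-EVAA-Franken-stock-v43-8B"
CONTEXT_LENGTH = 32768  # from the model description


def max_new_tokens_for(prompt_tokens, context_length=CONTEXT_LENGTH):
    """Tokens left for generation once the prompt occupies the context window."""
    if prompt_tokens >= context_length:
        raise ValueError("prompt does not fit in the context window")
    return context_length - prompt_tokens


def generate(prompt, max_new_tokens=256):
    """Load the model with Hugging Face transformers and generate a completion."""
    # Imported lazily so the budget helper above works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the dtype used for the merge
        device_map="auto",           # requires the accelerate package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    budget = max_new_tokens_for(inputs["input_ids"].shape[1])
    output = model.generate(**inputs, max_new_tokens=min(max_new_tokens, budget))
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Calling `generate("...")` downloads the weights on first use, so it needs enough disk space and memory for an 8B model.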