jaspionjader/Kosmos-EVAA-Franken-stock-v43-8B
jaspionjader/Kosmos-EVAA-Franken-stock-v43-8B is an 8 billion parameter language model created by jaspionjader using the Model Stock merge method. It merges jaspionjader/ek-5 and jaspionjader/ek-6 on top of the base model jaspionjader/ek-3 and supports a 32768 token context length. It is intended for general language tasks.
Overview
jaspionjader/Kosmos-EVAA-Franken-stock-v43-8B is an 8 billion parameter language model developed by jaspionjader. It was created with the Model Stock merge method, described in the paper "Model Stock", which combines the weights of multiple pre-trained language models into a single model.
Merge Details
The merge uses jaspionjader/ek-3 as the base model and combines two models:
- jaspionjader/ek-5
- jaspionjader/ek-6
The merge aims to combine the capabilities of its constituent models in a single model. It was performed with mergekit and configured to use the bfloat16 data type, which stores each weight in 16 bits.
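To make the merge step more concrete, here is a minimal sketch of the interpolation rule from the Model Stock paper, applied to a single layer with NumPy arrays. It is an illustration under assumptions, not mergekit's implementation: the function name `model_stock_layer` is hypothetical, the angle is taken as the average pairwise cosine between the fine-tuned weights measured relative to the base, and the degenerate case where the denominator is zero is not handled.

```python
import numpy as np


def model_stock_layer(base, tuned):
    """Merge one layer's weights with the Model Stock interpolation rule.

    base  -- weights of the base model for this layer
    tuned -- list of at least two fine-tuned weight arrays for the same layer
    """
    k = len(tuned)
    if k < 2:
        raise ValueError("Model Stock needs at least two fine-tuned models")

    # Task vectors: how far each fine-tuned weight moved away from the base.
    deltas = [w - base for w in tuned]

    # Average pairwise cosine similarity between the task vectors.
    cosines = []
    for i in range(k):
        for j in range(i + 1, k):
            a, b = deltas[i].ravel(), deltas[j].ravel()
            denom = np.linalg.norm(a) * np.linalg.norm(b)
            cosines.append(float(a @ b / denom) if denom > 0 else 0.0)
    cos_theta = float(np.mean(cosines))

    # Interpolation ratio: t = k*cos / (1 + (k - 1)*cos).
    t = k * cos_theta / (1.0 + (k - 1) * cos_theta)

    # Pull the average of the fine-tuned weights back toward the base.
    w_avg = np.mean(tuned, axis=0)
    return t * w_avg + (1.0 - t) * base
```

In this sketch, task vectors that point in the same direction give t = 1 (the plain average of the fine-tuned weights), while orthogonal task vectors give t = 0 (the base weights). In the setting of this card, `base` would correspond to a layer of jaspionjader/ek-3 and `tuned` to the matching layers of jaspionjader/ek-5 and jaspionjader/ek-6.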
Key Characteristics
- Parameter Count: 8 billion parameters.
- Context Length: 32768 tokens.
- Merge Method: Utilizes the Model Stock method for combining models.
Intended Use
This model is suitable for applications requiring a capable language model that benefits from the combined strengths of its merged components. Its large context window makes it particularly useful for tasks involving extensive text analysis or generation.
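A minimal sketch of how the model might be loaded and used for generation, assuming it is available on the Hugging Face Hub under the repository name given in this card and that the `transformers` and `torch` packages are installed. The helper names `max_prompt_tokens` and `generate` are hypothetical; the bfloat16 dtype and the 32768 token limit are taken from the details above.

```python
MODEL_ID = "jaspionjader/Kosmos-EVAA-Franken-stock-v43-8B"
CONTEXT_LENGTH = 32768  # context window stated in this card


def max_prompt_tokens(max_new_tokens, context_length=CONTEXT_LENGTH):
    """Return how many prompt tokens fit once room for generation is reserved."""
    if not 0 <= max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be in [0, context_length)")
    return context_length - max_new_tokens


def generate(prompt, max_new_tokens=256):
    """Generate a continuation of `prompt` (downloads the model on first use)."""
    # Imported lazily so the helper above works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # Assumption: load in bfloat16 to match the dtype used for the merge.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    # Truncate the prompt so prompt plus generated tokens stay within the window.
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=max_prompt_tokens(max_new_tokens),
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Summarize the following report: ...")` would return the prompt followed by the generated continuation. Reserving space for new tokens before truncating keeps long inputs from exhausting the context window.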