jaspionjader/Kosmos-EVAA-Franken-v36-8B

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 8k | Published: Jan 2, 2025 | Architecture: Transformer

jaspionjader/Kosmos-EVAA-Franken-v36-8B is an 8-billion-parameter language model created by jaspionjader through a SLERP merge of jaspionjader/f-8-8b and jaspionjader/f-5-8b. The merge uses a per-component configuration to blend the weights of the two source models. It is intended for general language understanding and generation tasks.

Model Overview

jaspionjader/Kosmos-EVAA-Franken-v36-8B is an 8-billion-parameter language model developed by jaspionjader. It was built with mergekit using the SLERP merge method, combining two source models: jaspionjader/f-8-8b and jaspionjader/f-5-8b.

Merge Details

The merge drew on layers 0-32 of both source models. Its configuration assigned different interpolation weights (t values) to the self-attention (self_attn) and multi-layer perceptron (mlp) components, so the two kinds of weights are blended in different proportions rather than with a single global ratio. The base model for the merge was jaspionjader/f-5-8b, and the merge was computed in bfloat16.
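To illustrate how component-specific t values might be chosen, here is a minimal Python sketch. It assumes that each component has a short list of anchor values which is linearly interpolated across layer depth, which is how mergekit-style gradients are commonly described. The anchor values, the default of 0.5, the count of 32 transformer blocks, and the helper names are all hypothetical; the actual t values for this model are not listed here.

```python
# Hypothetical t schedules: NOT the values used for this model.
T_SCHEDULES = {
    "self_attn": [0.0, 0.5, 1.0],
    "mlp": [1.0, 0.5, 0.0],
}
DEFAULT_T = 0.5   # hypothetical fallback for all other tensors
NUM_LAYERS = 32   # assumed number of transformer blocks


def interpolate_schedule(anchors, position):
    """Piecewise-linear interpolation of `anchors` at `position` in [0, 1]."""
    if len(anchors) == 1:
        return anchors[0]
    scaled = position * (len(anchors) - 1)
    lower = min(int(scaled), len(anchors) - 2)
    frac = scaled - lower
    return anchors[lower] * (1.0 - frac) + anchors[lower + 1] * frac


def t_for_parameter(name, layer_index, num_layers=NUM_LAYERS):
    """Pick the interpolation weight for one tensor by name and layer depth."""
    position = layer_index / (num_layers - 1) if num_layers > 1 else 0.0
    for component, anchors in T_SCHEDULES.items():
        if component in name:
            return interpolate_schedule(anchors, position)
    return DEFAULT_T


if __name__ == "__main__":
    for layer in (0, 15, 31):
        name = f"model.layers.{layer}.self_attn.q_proj.weight"
        print(layer, round(t_for_parameter(name, layer), 3))
```

With these made-up schedules, attention weights move from one source model toward the other as depth increases, while MLP weights move in the opposite direction. Which model t = 0 selects depends on the merge tool's convention.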

Key Characteristics

  • Architecture: Merged model derived from two 8B parameter models.
  • Merge Method: Uses SLERP (Spherical Linear Interpolation) to combine model weights.
  • Custom Configuration: Uses a YAML configuration that sets separate blending parameters for different components of the source models.
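For readers unfamiliar with SLERP, the following is a minimal, dependency-free Python sketch of spherical linear interpolation between two weight vectors. It is not mergekit's implementation. The function name, the flat-list representation of the weights, and the fallback to linear interpolation for nearly parallel vectors are assumptions made for illustration.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between vectors v0 (t=0) and v1 (t=1)."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))

    def lerp():
        # Plain linear interpolation, used when the angle is ill-defined.
        return [(1.0 - t) * a + t * b for a, b in zip(v0, v1)]

    if norm0 < eps or norm1 < eps:
        return lerp()

    # Cosine of the angle between the two vectors, clamped for safety.
    cos_theta = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    cos_theta = max(-1.0, min(1.0, cos_theta))

    # Nearly parallel (or anti-parallel) vectors: sin(theta) is close to 0.
    if abs(cos_theta) > 1.0 - 1e-6:
        return lerp()

    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    w0 = math.sin((1.0 - t) * theta) / sin_theta
    w1 = math.sin(t * theta) / sin_theta
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


if __name__ == "__main__":
    print(slerp(0.5, [1.0, 0.0], [0.0, 1.0]))
```

Unlike a plain weighted average, interpolating along the arc between two unit vectors keeps the result at unit length, which is the usual motivation for preferring SLERP over linear blending when merging weights.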

Intended Use

This model is intended for general-purpose language tasks, drawing on the capabilities of both source models. It is an option for developers who want a general-purpose 8B-parameter model.