KaraKaraWitch/spiral-da-HYAH-Qwen2.5-72b

Parameters: 72.7B · Precision: FP8 · Context length: 131,072 · Hosted on Hugging Face

Model Overview

KaraKaraWitch/spiral-da-HYAH-Qwen2.5-72b is a 72.7 billion parameter language model developed by KaraKaraWitch. It was created with the Model Stock merge method, a technique for combining the strengths of multiple pre-existing models into a single, more capable model. The base model for the merge was rombodawg/Rombos-LLM-V2.5-Qwen-72b.
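Merges of this kind are commonly expressed as a mergekit configuration file. The sketch below is illustrative only: the two `model:` entries are placeholders (the constituent models are not named in this section), and the exact options used by the author may differ.

```yaml
# Hypothetical mergekit config sketch for a Model Stock merge.
# The two "models" entries are placeholders, not taken from the card.
merge_method: model_stock
base_model: rombodawg/Rombos-LLM-V2.5-Qwen-72b
models:
  - model: <fine-tuned-model-1>   # placeholder
  - model: <fine-tuned-model-2>   # placeholder
parameters:
  normalize: true
dtype: bfloat16
```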

Merge Details

This model integrates components from two additional fine-tuned models to enhance its overall performance and capabilities.

The Model Stock method, described in the paper "Model Stock: A Method for Merging Large Language Models", was applied with normalization enabled and the bfloat16 data type to produce the merged model. The goal of the merge was a robust model for general language generation and understanding, combining the diverse strengths of its constituent models.
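At its core, the Model Stock paper derives a closed-form interpolation ratio from the angle between the two fine-tuned models' weight deltas relative to the pretrained anchor, then interpolates between their average and the anchor. The sketch below is a minimal toy illustration of that formula on flat weight vectors (my reading of the paper, not the author's merge code; real merges operate per layer on full checkpoints, and it assumes the two deltas are not exactly opposed):

```python
import numpy as np

def model_stock_merge(w0, w1, w2):
    """Toy Model Stock merge of two fine-tuned weight vectors.

    w0     : pretrained (anchor) weights
    w1, w2 : two fine-tuned weight vectors
    Returns t * avg(w1, w2) + (1 - t) * w0, where the ratio
    t = 2*cos(theta) / (1 + cos(theta)) comes from the angle theta
    between the deltas (w1 - w0) and (w2 - w0).
    """
    d1, d2 = w1 - w0, w2 - w0
    cos_theta = np.dot(d1, d2) / (np.linalg.norm(d1) * np.linalg.norm(d2))
    t = 2 * cos_theta / (1 + cos_theta)  # interpolation ratio from the paper
    w_avg = (w1 + w2) / 2                # midpoint of the fine-tuned weights
    return t * w_avg + (1 - t) * w0
```

Intuitively, when the two fine-tunes agree (small angle, cos close to 1), t approaches 1 and the merge trusts their average; when they are orthogonal, t falls to 0 and the merge collapses back to the pretrained anchor.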