KaraKaraWitch/spiral-da-HYAH-Qwen2.5-72b

Hugging Face
TEXT GENERATIONConcurrency Cost:4Model Size:72.7BQuant:FP8Ctx Length:32kPublished:Oct 21, 2024Architecture:Transformer0.0K Warm

KaraKaraWitch/spiral-da-HYAH-Qwen2.5-72b is a 72.7 billion parameter language model created by KaraKaraWitch, built using the Model Stock merge method. It leverages rombodawg/Rombos-LLM-V2.5-Qwen-72b as its base, integrating capabilities from anthracite-org/magnum-v4-72b and AXCXEPT/EZO-Qwen2.5-72B-Instruct. This model is designed for general language tasks, benefiting from the combined strengths of its constituent models.

Loading preview...

Model Overview

KaraKaraWitch/spiral-da-HYAH-Qwen2.5-72b is a 72.7 billion parameter language model developed by KaraKaraWitch. This model was created using the Model Stock merge method, a technique designed to combine the strengths of multiple pre-existing models into a single, more capable entity. The base model for this merge was rombodawg/Rombos-LLM-V2.5-Qwen-72b.

Merge Details

This model integrates components from two additional models to enhance its overall performance and capabilities:

The Model Stock method, as described in the paper "Model Stock: A Method for Merging Large Language Models", was applied with specific configurations, including normalization and bfloat16 data type, to produce this merged model. The intention behind this merge was to create a robust model for general language generation and understanding tasks by combining diverse strengths from its constituent models.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p