marcuscedricridia/Hush-Qwen2.5-7B-MST-v1.3

Text Generation | Concurrency Cost: 1 | Model Size: 7.6B | Quant: FP8 | Ctx Length: 32k | Published: Mar 9, 2025 | Architecture: Transformer

marcuscedricridia/Hush-Qwen2.5-7B-MST-v1.3 is a 7.6 billion parameter language model based on the Qwen2.5-7B architecture, created by marcuscedricridia. It was produced with the Model Stock merge method, which combines several specialized Qwen2.5-7B variants into one set of weights. The goal is to carry over the strengths of the constituent models and provide a versatile base for natural language processing tasks, with a 32K context length.


Model Overview

marcuscedricridia/Hush-Qwen2.5-7B-MST-v1.3 is a 7.6 billion parameter language model built on the Qwen2.5-7B base architecture. Developed by marcuscedricridia, it uses the Model Stock merge method to combine the weights of multiple fine-tuned Qwen2.5-7B variants into a single model.
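
The card does not spell out how Model Stock combines weights. As a rough illustration only, the sketch below follows the interpolation rule from the Model Stock paper as commonly stated: for k fine-tuned models, an angle is estimated between their weight deltas relative to the base, a ratio t = k·cos θ / (1 + (k − 1)·cos θ) is computed, and the result is t times the fine-tuned average plus (1 − t) times the base. The function names are hypothetical, the weights are flat Python lists standing in for one layer, and averaging the pairwise cosines for k > 2 is an assumption of this sketch, not a description of the tooling actually used for this model.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def model_stock_layer(base, finetuned):
    """Merge one layer's weights; returns (merged_weights, interpolation_ratio).

    base      -- flat list of base-model weights for the layer
    finetuned -- list of flat weight lists, one per fine-tuned model
    """
    k = len(finetuned)
    if k < 2:
        raise ValueError("need at least two fine-tuned models")

    # Deltas of each fine-tuned model relative to the base weights.
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]

    # Assumption: estimate cos(theta) as the mean pairwise cosine of the deltas.
    pairs = list(combinations(deltas, 2))
    cos_theta = sum(cosine(u, v) for u, v in pairs) / len(pairs)

    # Interpolation ratio between the fine-tuned average and the base.
    t = k * cos_theta / (1 + (k - 1) * cos_theta)

    average = [sum(column) / k for column in zip(*finetuned)]
    merged = [t * a + (1 - t) * b for a, b in zip(average, base)]
    return merged, t
```

When the fine-tuned deltas point in the same direction, t approaches 1 and the result is close to the plain average; when they are orthogonal, t is 0 and the layer falls back to the base weights.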

Merge Details

This model is a merge of the following specialized Qwen2.5-7B models:

  • Etherll/Qwen2.5-7B-della-test
  • marcuscedricridia/Hush-Qwen2.5-7B-Preview
  • marcuscedricridia/absolute-o1-7b
  • marcuscedricridia/sbr-o1-7b
  • marcuscedricridia/Hush-Qwen2.5-7B-RP-v1.1-1M

The merge was computed with the bfloat16 data type, with int8_mask enabled and normalization applied. The tokenizer was taken from the base model. The aim is to consolidate the strengths of the constituent models into a single model.
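
The settings above resemble a mergekit-style configuration, although the card does not name the tool. Assuming that layout, the merge could be written down roughly as follows. The key names follow mergekit's usual YAML schema (shown here as a Python dict and printed as JSON to avoid extra dependencies), and the base model entry is a placeholder because the card does not state which checkpoint served as the base.

```python
import json

# Constituent models, exactly as listed in the card.
CONSTITUENTS = [
    "Etherll/Qwen2.5-7B-della-test",
    "marcuscedricridia/Hush-Qwen2.5-7B-Preview",
    "marcuscedricridia/absolute-o1-7b",
    "marcuscedricridia/sbr-o1-7b",
    "marcuscedricridia/Hush-Qwen2.5-7B-RP-v1.1-1M",
]

# Hypothetical reconstruction of the merge settings; not the author's actual file.
merge_config = {
    "merge_method": "model_stock",
    # Placeholder: the card does not name the base checkpoint.
    "base_model": "<base model not stated in the card>",
    "models": [{"model": name} for name in CONSTITUENTS],
    "dtype": "bfloat16",
    "parameters": {"int8_mask": True, "normalize": True},
    # "base" means the tokenizer is inherited from the base model.
    "tokenizer_source": "base",
}

print(json.dumps(merge_config, indent=2))
```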

Key Characteristics

  • Architecture: Qwen2.5-7B base
  • Parameter Count: 7.6 billion
  • Context Length: 32,768 tokens
  • Development Method: Model Stock merge, integrating multiple fine-tuned models.
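
As a minimal usage sketch, assuming the model is published on the Hugging Face Hub under the repository name above and that the transformers and torch packages are installed, the model could be loaded and queried as shown below. The helper names are hypothetical. The context check simply compares the prompt length plus the requested new tokens against the 32,768-token limit listed above.

```python
MODEL_ID = "marcuscedricridia/Hush-Qwen2.5-7B-MST-v1.3"
CONTEXT_LENGTH = 32_768  # tokens, as listed in the card


def fits_in_context(prompt_tokens, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Return True if the prompt plus the generated tokens stay within the context window."""
    return prompt_tokens + max_new_tokens <= context_length


def generate(prompt, max_new_tokens=256):
    """Generate a completion for `prompt`; downloads the weights on first use."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")

    prompt_tokens = inputs["input_ids"].shape[1]
    if not fits_in_context(prompt_tokens, max_new_tokens):
        raise ValueError("prompt plus max_new_tokens exceeds the context length")

    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Write a short greeting.")` fetches roughly 7.6 billion parameters' worth of weights, so it is best run on a machine with enough memory and disk space.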

Potential Use Cases

Because it is a merge, this model is likely suitable for a broad range of applications and may do well in the areas where its constituent models are strong, such as:

  • General text generation and understanding
  • Role-playing scenarios (via the Hush-Qwen2.5-7B-RP-v1.1-1M constituent)
  • Tasks that benefit from the combined strengths of the merged fine-tuned models