marcuscedricridia/Cheng-2-v1.1

Text generation | Concurrency cost: 1 | Model size: 14.8B | Quant: FP8 | Context length: 32k | Published: Mar 12, 2025 | Architecture: Transformer

Cheng-2-v1.1 is a 14.8-billion-parameter language model created by marcuscedricridia. It was merged with the Model Stock method, using YOYO-AI/Qwen2.5-14B-it-restore as the base. The merge combines several fine-tuned models, including Lunzima/NQLSG-Qwen2.5-14B-MegaFusion-v9.1 and arcee-ai/Virtuoso-Small-v2. It supports a context length of 32,768 tokens and is intended as a general-purpose text-generation model.
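A minimal sketch of working within the 32,768-token context window, assuming the model is published under the repository ID shown in the page title and loads through the standard Hugging Face transformers API (neither is confirmed by this page). The helper names `max_new_tokens_allowed` and `load_model` are hypothetical, and `device_map="auto"` additionally assumes the accelerate package is installed.

```python
CONTEXT_LENGTH = 32768  # context length stated in the model description


def max_new_tokens_allowed(prompt_tokens: int,
                           context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_length - prompt_tokens, 0)


def load_model(repo_id: str = "marcuscedricridia/Cheng-2-v1.1"):
    """Load tokenizer and model (assumed repo ID; downloads ~14.8B weights)."""
    # Imported lazily so the budget helper works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype=torch.bfloat16,  # assumption: load in bfloat16
        device_map="auto",           # assumption: accelerate is available
    )
    return tokenizer, model


# A 30,000-token prompt leaves room for 2,768 generated tokens.
print(max_new_tokens_allowed(30000))
```

The budget helper is kept separate from loading so that prompt sizes can be checked without downloading the weights.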


Cheng-2-v1.1: A Merged Language Model

Cheng-2-v1.1 is a 14.8-billion-parameter language model developed by marcuscedricridia on top of the YOYO-AI/Qwen2.5-14B-it-restore base model. It was produced with the Model Stock merging technique, which averages the weights of several fine-tuned models and then interpolates that average back toward the base model's weights.
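To make the Model Stock idea concrete, here is a toy sketch on flat lists of floats. It follows the interpolation rule from the Model Stock paper, t = k·cosθ / (1 + (k−1)·cosθ), where k is the number of fine-tuned models and cosθ is the average cosine similarity between their offsets from the base. This is an illustration only, not the code used to build Cheng-2-v1.1; real merging tools apply the rule tensor by tensor, and the function name `model_stock_merge` is hypothetical.

```python
import math
from itertools import combinations


def model_stock_merge(base, finetuned):
    """Merge flat weight vectors with the Model Stock interpolation rule.

    Returns (merged_weights, t), where t is the interpolation ratio.
    """
    k = len(finetuned)
    if k < 2:
        raise ValueError("Model Stock needs at least two fine-tuned models")

    # Offsets of each fine-tuned model from the base weights.
    deltas = [[w - b for w, b in zip(model, base)] for model in finetuned]

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
        return dot / norm if norm else 0.0

    # Average cosine similarity over all pairs of offsets.
    pairs = list(combinations(deltas, 2))
    cos_theta = sum(cosine(u, v) for u, v in pairs) / len(pairs)

    denom = 1 + (k - 1) * cos_theta
    if abs(denom) < 1e-12:
        raise ValueError("interpolation ratio is undefined for these offsets")
    t = k * cos_theta / denom

    # Plain average of the fine-tuned weights, then pull it toward the base.
    avg = [sum(column) / k for column in zip(*finetuned)]
    merged = [t * a + (1 - t) * b for a, b in zip(avg, base)]
    return merged, t


merged, t = model_stock_merge([0.0, 0.0], [[1.0, 0.0], [1.0, 1.0]])
print(t, merged)
```

When the fine-tuned models agree closely (cosθ near 1), t approaches 1 and the result is close to their plain average; when their offsets are orthogonal (cosθ = 0), t is 0 and the result falls back to the base weights.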

Key Merge Details

The merge combines the following components:

  • Base Model: YOYO-AI/Qwen2.5-14B-it-restore provides the base weights for the merge.
  • Merged Ingredients: The following models are merged onto the base:
    • Lunzima/NQLSG-Qwen2.5-14B-MegaFusion-v9.1
    • arcee-ai/Virtuoso-Small-v2
    • Multiple proprietary 'Cheng-2-Ingredient' models from marcuscedricridia.

This merging strategy aims to combine the specialized strengths of each constituent model in a single general-purpose model with a 32,768-token context length. The merge was run with dtype set to bfloat16 and with the int8_mask and normalize options set in its configuration.
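The settings above resemble a mergekit-style configuration. The sketch below renders such a configuration as YAML text. It assumes the merge was run with mergekit, which this page does not state, and that `int8_mask` and `normalize` were both set to true. The 'Cheng-2-Ingredient' models are not named individually here, so they are left out rather than guessed. The helper `render_config` is hypothetical.

```python
BASE_MODEL = "YOYO-AI/Qwen2.5-14B-it-restore"

# Only the ingredients named in the text; the 'Cheng-2-Ingredient'
# models are not identified, so they are intentionally omitted.
INGREDIENTS = [
    "Lunzima/NQLSG-Qwen2.5-14B-MegaFusion-v9.1",
    "arcee-ai/Virtuoso-Small-v2",
]


def render_config(base, models, dtype="bfloat16",
                  int8_mask=True, normalize=True):
    """Render a mergekit-style YAML config as a string.

    The true values for int8_mask and normalize are assumptions.
    """
    lines = [
        "merge_method: model_stock",
        f"base_model: {base}",
        "models:",
    ]
    lines += [f"  - model: {name}" for name in models]
    lines += [
        f"dtype: {dtype}",
        "parameters:",
        f"  int8_mask: {str(int8_mask).lower()}",
        f"  normalize: {str(normalize).lower()}",
    ]
    return "\n".join(lines) + "\n"


print(render_config(BASE_MODEL, INGREDIENTS))
```

Building the text from a plain list keeps the example free of third-party dependencies; with PyYAML available, dumping a dictionary would be the more usual route.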