CultriX/Qwen2.5-14B-Wernicke

Text Generation · Concurrency Cost: 1 · Model Size: 14.8B · Quant: FP8 · Ctx Length: 32k · Published: Oct 21, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

CultriX/Qwen2.5-14B-Wernicke is a 14.8-billion-parameter language model based on the Qwen2.5 architecture, created by CultriX through a Model Stock merge. The model integrates multiple specialized Qwen2.5-14B variants, including instruction-tuned and domain-specific versions, to improve overall performance. It is designed for general language understanding and generation tasks, leveraging the combined strengths of its constituent models, and the merge aims to provide a robust foundation for diverse applications requiring a capable 14B-parameter model.


CultriX/Qwen2.5-14B-Wernicke: Merged Language Model

CultriX/Qwen2.5-14B-Wernicke is a 14.8-billion-parameter language model built on the Qwen2.5-14B base. It was created with the Model Stock merge method, a technique that combines several fine-tuned checkpoints of the same base model into a single, more capable one by interpolating their weights. The merge integrates five distinct Qwen2.5-14B variants, including instruction-tuned and specialized models.
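The core idea of Model Stock can be sketched in a few lines: treat each fine-tuned model's weights as a vector, measure how much the fine-tuned deltas agree (their average pairwise cosine similarity), and use that angle to pick an interpolation ratio between the fine-tuned average and the base weights. The sketch below operates on plain Python lists standing in for flattened weight tensors; it is a simplified illustration of the published formula, not the exact code used to build this model.

```python
import math


def model_stock_merge(base: list[float], finetuned: list[list[float]]) -> list[float]:
    """Merge fine-tuned weight vectors toward a shared base using the
    Model Stock interpolation ratio (simplified sketch on flat vectors)."""
    k = len(finetuned)
    # Delta of each fine-tuned model from the base checkpoint.
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]

    def cosine(u: list[float], v: list[float]) -> float:
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

    # Average pairwise cosine similarity between the deltas.
    pairs = [(i, j) for i in range(k) for j in range(i + 1, k)]
    avg_cos = sum(cosine(deltas[i], deltas[j]) for i, j in pairs) / len(pairs)

    # Model Stock interpolation ratio: t = k*cos / (1 + (k-1)*cos).
    t = k * avg_cos / (1 + (k - 1) * avg_cos)

    # Interpolate between the fine-tuned average and the base weights.
    avg_ft = [sum(ws) / k for ws in zip(*finetuned)]
    return [t * a + (1 - t) * b for a, b in zip(avg_ft, base)]
```

When the fine-tuned deltas point in orthogonal directions (cosine 0), the ratio collapses to 0 and the merge falls back to the base weights; when they agree perfectly (cosine 1), the merge is simply their average.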

Key Components Merged

This model incorporates the following Qwen2.5-14B based models:

  • v000000/Qwen2.5-Lumen-14B
  • arcee-ai/SuperNova-Medius
  • rombodawg/Rombos-LLM-V2.6-Qwen-14b
  • Qwen/Qwen2.5-14B-Instruct
  • EVA-UNIT-01/EVA-Qwen2.5-14B-v0.0
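The exact merge recipe is not reproduced on this page; a hypothetical mergekit configuration for a `model_stock` merge of these five components might look like the following (the base model and dtype are assumptions, not confirmed details of this release):

```yaml
# Hypothetical mergekit recipe; the actual config used by CultriX
# is not shown on this page.
models:
  - model: v000000/Qwen2.5-Lumen-14B
  - model: arcee-ai/SuperNova-Medius
  - model: rombodawg/Rombos-LLM-V2.6-Qwen-14b
  - model: Qwen/Qwen2.5-14B-Instruct
  - model: EVA-UNIT-01/EVA-Qwen2.5-14B-v0.0
merge_method: model_stock
base_model: Qwen/Qwen2.5-14B  # assumed common ancestor; Model Stock requires one
dtype: bfloat16               # assumed working precision
```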

Performance Overview

Evaluations on the Open LLM Leaderboard indicate an average score of 36.60. Specific benchmark results include:

  • IFEval (0-Shot): 52.35
  • BBH (3-Shot): 50.64
  • MMLU-PRO (5-shot): 49.15

Use Cases

Given its Qwen2.5 foundation and the deliberate mix of merged models, CultriX/Qwen2.5-14B-Wernicke suits a broad range of general-purpose language tasks. Its instruction-tuned components suggest strong instruction following, while the specialized models in the mix may improve performance in their respective domains, making it a versatile choice for applications that need a balanced 14B-parameter model.
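As a minimal usage sketch, the model can be run with Hugging Face transformers like any other Qwen2.5 chat model. The helper below builds a Qwen2.5-style message list; the guarded main block (which downloads the full weights, a large transfer) applies the tokenizer's chat template and generates a reply. The prompt and generation settings are illustrative, not recommendations from the model author.

```python
def build_chat(user_prompt: str) -> list[dict]:
    """Build a chat message list in the format Qwen2.5 chat templates expect."""
    return [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": user_prompt},
    ]


if __name__ == "__main__":
    # Heavy imports are kept here so the helper above can be used
    # without pulling in transformers; loading fetches the full weights.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "CultriX/Qwen2.5-14B-Wernicke"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=torch.bfloat16, device_map="auto"
    )

    text = tokenizer.apply_chat_template(
        build_chat("Summarize the Model Stock merge method."),
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256)
    # Decode only the newly generated tokens.
    print(tokenizer.decode(out[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```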