allknowingroger/QwenStock3-14B
  • Task: Text generation
  • Model size: 14.8B parameters
  • Quantization: FP8
  • Context length: 32k
  • Concurrency cost: 1
  • Published: Nov 29, 2024
  • License: apache-2.0
  • Architecture: Transformer (open weights)

allknowingroger/QwenStock3-14B is a 14.8 billion parameter language model created by allknowingroger using the Model Stock merge method. It uses CultriX/SeQwence-14Bv1 as its base and folds in multiple Qwen2.5-14B variants. The model targets general language tasks and averages 36.97 on the Open LLM Leaderboard, including 56.15 on IFEval (0-shot) and 49.20 on MMLU-PRO (5-shot). Its development focuses on combining the strengths of several pre-trained models to improve overall capability.


Model Overview

allknowingroger/QwenStock3-14B is a 14.8 billion parameter language model developed by allknowingroger. This model was created using the Model Stock merge method, which combines the strengths of several pre-trained models into a single, more capable entity. The base model for this merge was CultriX/SeQwence-14Bv1.

Merge Details

This model integrates a diverse set of Qwen2.5-14B variants and other related models, merged on top of the CultriX/SeQwence-14Bv1 base.
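
The card does not reproduce the exact merge recipe, but Model Stock merges are commonly specified with a mergekit YAML config along these lines. Only the base model name below comes from this card; the `models` entries are placeholders for the unlisted Qwen2.5-14B variants:

```yaml
# Hypothetical mergekit config sketch for a Model Stock merge.
# Only base_model is taken from the card; the models list is illustrative.
merge_method: model_stock
base_model: CultriX/SeQwence-14Bv1
models:
  - model: <Qwen2.5-14B variant 1>
  - model: <Qwen2.5-14B variant 2>
dtype: bfloat16
```

Model Stock averages the fine-tuned checkpoints and interpolates toward the base model, which is why a `base_model` entry is required alongside the merged variants.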

Performance Highlights

Evaluations on the Open LLM Leaderboard show an average score of 36.97. Key performance metrics include:

  • IFEval (0-shot): 56.15
  • BBH (3-shot): 50.58
  • MMLU-PRO (5-shot): 49.20

These results indicate solid capability across reasoning and knowledge-based tasks. The model's 32,768-token (32k) context length supports long inputs and extended outputs.
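
As a quick illustration of budgeting against that window, here is a minimal sketch; the constant and helper names are my own, not part of any API:

```python
# Sketch: budgeting prompt + generation against the 32,768-token window.
CTX_LEN = 32_768  # context length stated on the card (32k)

def fits_in_context(n_prompt_tokens: int, max_new_tokens: int,
                    ctx_len: int = CTX_LEN) -> bool:
    """True if the prompt plus the requested generation fits the window."""
    return n_prompt_tokens + max_new_tokens <= ctx_len

print(fits_in_context(30_000, 2_000))  # 32,000 tokens -> True, fits
print(fits_in_context(32_000, 1_000))  # 33,000 tokens -> False, over budget
```

Requests that exceed the window must either truncate the prompt or lower `max_new_tokens`.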

Use Cases

Given its merged architecture and benchmark results, QwenStock3-14B is suitable for general-purpose language generation, question answering, and tasks that require broad domain knowledge. Its design aims to pool the strengths of its constituent models.
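
Since the weights are openly licensed, the model can be run locally. A minimal sketch using the standard Hugging Face transformers API (the prompt text and helper names are illustrative; loading a 14.8B model requires substantial GPU memory):

```python
# Minimal inference sketch for allknowingroger/QwenStock3-14B via transformers.
MODEL_ID = "allknowingroger/QwenStock3-14B"

def build_messages(user_prompt: str,
                   system_prompt: str = "You are a helpful assistant.") -> list:
    """Assemble a chat-style message list; Qwen-family models ship a chat template."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

def generate(user_prompt: str, max_new_tokens: int = 256) -> str:
    """Load the model and generate a reply (needs a GPU with enough memory)."""
    # Imported here so build_messages() stays usable without the heavy dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype="auto", device_map="auto"
    )
    prompt = tokenizer.apply_chat_template(
        build_messages(user_prompt), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
```

For example, `generate("Summarize the Model Stock merge method.")` would return the model's reply as a string.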