allknowingroger/QwenStock3-14B is a 14.8 billion parameter language model created by allknowingroger using the Model Stock merge method. It is based on CultriX/SeQwence-14Bv1 and integrates multiple Qwen2.5-14B variants. Designed for general language tasks, it achieves an average score of 36.97 on the Open LLM Leaderboard, including 56.15 on IFEval and 49.20 on MMLU-PRO. Its development focuses on combining strengths from various pre-trained models to enhance overall capability.
Model Overview
allknowingroger/QwenStock3-14B is a 14.8 billion parameter language model developed by allknowingroger. It was created using the Model Stock merge method, which combines the strengths of several pre-trained models into a single, more capable model. The base model for this merge was CultriX/SeQwence-14Bv1.
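Model Stock selects an interpolation ratio between the averaged fine-tuned weights and the base ("anchor") weights, based on the angle between the fine-tuned models' weight deltas. The sketch below is a simplified per-tensor illustration of that idea using NumPy; it is not the actual merge code used to build this model, and the function name is hypothetical.

```python
import numpy as np

def model_stock_merge(base, finetuned, eps=1e-8):
    """Toy per-tensor Model Stock-style merge (illustrative only).

    base:      pretrained anchor weight tensor (w_0)
    finetuned: list of k fine-tuned weight tensors
    Returns t * mean(finetuned) + (1 - t) * base, where t is derived
    from the average angle between the fine-tuned weight deltas.
    """
    k = len(finetuned)
    deltas = [w - base for w in finetuned]
    # Average cosine similarity between pairs of fine-tuned deltas:
    # nearly parallel deltas -> trust the average; orthogonal -> stay near base.
    cosines = []
    for i in range(k):
        for j in range(i + 1, k):
            a, b = deltas[i].ravel(), deltas[j].ravel()
            cosines.append(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))
    cos_theta = float(np.mean(cosines)) if cosines else 1.0
    # Interpolation ratio as in the Model Stock formulation
    t = k * cos_theta / (1 + (k - 1) * cos_theta)
    avg = np.mean(finetuned, axis=0)
    return t * avg + (1 - t) * base
```

With identical fine-tuned models the ratio t reaches 1 (pure averaging); with orthogonal deltas it falls to 0 (the merge stays at the base weights).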
Merge Details
This model integrates a diverse set of Qwen2.5-14B variants and other related models, including:
- CultriX/Qwen2.5-14B-MergeStock
- allknowingroger/QwenStock1-14B
- CultriX/Qwen2.5-14B-Wernicke
- allknowingroger/QwenStock2-14B
- allknowingroger/Qwenslerp2-14B
- CultriX/Qwen2.5-14B-MegaMerge-pt2
- CultriX/Qwestion-14B
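Merges of this kind are commonly produced with mergekit, which supports a `model_stock` merge method. The configuration below is a hypothetical sketch of what such a merge might look like; the actual config, model list ordering, and dtype are assumptions, not taken from the repository.

```yaml
# Hypothetical mergekit config (illustrative, not the published one)
merge_method: model_stock
base_model: CultriX/SeQwence-14Bv1
models:
  - model: CultriX/Qwen2.5-14B-MergeStock
  - model: allknowingroger/QwenStock1-14B
  - model: CultriX/Qwen2.5-14B-Wernicke
  - model: allknowingroger/QwenStock2-14B
  - model: allknowingroger/Qwenslerp2-14B
  - model: CultriX/Qwen2.5-14B-MegaMerge-pt2
  - model: CultriX/Qwestion-14B
dtype: bfloat16
```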
Performance Highlights
Evaluations on the Open LLM Leaderboard show an average score of 36.97. Key performance metrics include:
- IFEval (0-Shot): 56.15
- BBH (3-Shot): 50.58
- MMLU-PRO (5-Shot): 49.20
These results reflect its capabilities across instruction-following, reasoning, and knowledge-based tasks. The model's 32,768-token context length supports processing longer inputs and generating more extensive outputs.
Use Cases
Given its merged architecture and performance, QwenStock3-14B is suitable for general-purpose language generation, question answering, and tasks requiring a broad understanding of various domains. Its design aims to leverage the collective intelligence of its constituent models.