allknowingroger/QwenStock2-14B
allknowingroger/QwenStock2-14B is a 14.8 billion parameter language model created by allknowingroger using the Model Stock merge method. It is based on CultriX/SeQwence-14Bv1 and integrates several other Qwen2.5-14B variants. Intended for general language understanding and generation tasks, it averages 36.93 on the Open LLM Leaderboard, with notable results on IFEval (55.63) and BBH (50.60).
Overview
allknowingroger/QwenStock2-14B is a 14.8 billion parameter language model developed by allknowingroger. It was created using the Model Stock merge method, a technique introduced in the paper "Model Stock: All We Need Is Just a Few Fine-tuned Models." The base model for this merge was CultriX/SeQwence-14Bv1, and the merge incorporates seven other Qwen2.5-14B variants from both allknowingroger and CultriX.
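For intuition, the sketch below illustrates the interpolation rule described in the Model Stock paper: the k fine-tuned weight tensors are averaged, then pulled back toward the base (anchor) tensor with a ratio t = k·cosθ / (1 + (k−1)·cosθ) derived from the angles between the fine-tuned task vectors. This is a toy, per-tensor illustration with random stand-in tensors, not the actual pipeline used to build this model (merges like this one are typically produced with mergekit's model_stock method over full checkpoints).

```python
# Toy sketch of the Model Stock interpolation rule on a single weight tensor.
# Hypothetical stand-in tensors; real merges iterate over every tensor in the
# checkpoints (e.g. via mergekit).
import torch

def model_stock_merge(base: torch.Tensor, finetuned: list[torch.Tensor]) -> torch.Tensor:
    """Merge k fine-tuned weight tensors around a shared base (anchor) tensor."""
    k = len(finetuned)
    deltas = [(w - base).flatten() for w in finetuned]
    # Average pairwise cosine similarity between task vectors, measured from the base.
    cos_vals = [
        torch.nn.functional.cosine_similarity(deltas[i], deltas[j], dim=0)
        for i in range(k) for j in range(i + 1, k)
    ]
    cos_theta = torch.stack(cos_vals).mean().clamp(min=0.0)
    # Interpolation ratio from the Model Stock paper: t = k*cos / (1 + (k-1)*cos).
    t = (k * cos_theta) / (1 + (k - 1) * cos_theta)
    w_avg = torch.stack(finetuned).mean(dim=0)
    return t * w_avg + (1 - t) * base

# Example with random stand-in tensors (not real model weights).
base = torch.randn(512, 512)
variants = [base + 0.01 * torch.randn_like(base) for _ in range(7)]
merged = model_stock_merge(base, variants)
print(merged.shape)
```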
Key Capabilities
- Merged Architecture: Leverages the strengths of multiple Qwen2.5-14B models through the Model Stock merging technique.
- General Language Tasks: Suitable for a broad range of natural language understanding and generation applications; see the usage sketch below.
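A minimal inference sketch with Hugging Face transformers follows. It assumes the merged model ships the usual Qwen2.5 tokenizer and chat template (typical for Qwen2.5-14B merges, but worth verifying) and that accelerate is installed for device_map="auto".

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allknowingroger/QwenStock2-14B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Summarize the Model Stock merge method in two sentences."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
# Strip the prompt tokens before decoding the reply.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```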
Performance Highlights
Evaluated on the Open LLM Leaderboard, QwenStock2-14B achieved an average score of 36.93. Specific metric scores include (see the reproduction sketch after this list):
- IFEval (0-shot): 55.63
- BBH (3-shot): 50.60
- MMLU-PRO (5-shot): 48.95
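Scores like these can be approximated locally with EleutherAI's lm-evaluation-harness (pip install lm-eval). This is a hedged sketch: the leaderboard_* task names match the Open LLM Leaderboard v2 groups shipped in recent harness releases, so check them against your installed version, and expect a 14B model to require substantial GPU memory.

```python
import lm_eval

# Run the leaderboard-style tasks reported above against the merged model.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=allknowingroger/QwenStock2-14B,dtype=bfloat16",
    tasks=["leaderboard_ifeval", "leaderboard_bbh", "leaderboard_mmlu_pro"],
    batch_size="auto",
)
print(results["results"])
```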
Good for
- Developers looking for a robust 14B parameter model for general-purpose language tasks.
- Experimentation with models created using advanced merging techniques like Model Stock.
- Applications requiring solid performance across various benchmarks, particularly instruction following (IFEval) and multi-step reasoning (BBH).