allknowingroger/Qwenslerp4-14B
allknowingroger/Qwenslerp4-14B is a 14.8-billion-parameter language model based on Qwen/Qwen2.5-14B, created by allknowingroger with the DARE TIES merge method. It combines several specialized Qwen2.5-14B variants to strengthen reasoning and factual understanding, and is tuned toward benchmarks such as MATH, MUSR, GPQA, and IFEval, making it suitable for complex problem-solving and knowledge-intensive applications.
Overview
allknowingroger/Qwenslerp4-14B is a 14.8-billion-parameter language model developed by allknowingroger on the Qwen/Qwen2.5-14B base. It uses the DARE TIES merge method to combine four distinct Qwen2.5-14B variants: CultriX/Qwen2.5-14B-Wernicke, VAGOsolutions/SauerkrautLM-v2-14b-DPO, rombodawg/Rombos-LLM-V2.6-Qwen-14b, and allknowingroger/Qwenslerp2-14B. The merge aims to consolidate the specific strengths of each component model.
Key Capabilities
- Enhanced Reasoning: Prioritizes performance in reasoning-heavy tasks such as MATH and MUSR, with specific task weights applied during the merge.
- Factual Recall & Understanding: Boosts accuracy in GPQA and maintains consistent knowledge representation in MMLU-PRO.
- Instruction Following: Designed to maintain high IFEval performance, indicating strong adherence to instructions.
- Optimized for Efficiency: Uses an `int8_mask` and the `bfloat16` dtype for memory and compute efficiency during the merge, alongside the `normalize` parameter for scale consistency (see the configuration sketch after this list).
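
For concreteness, a DARE TIES merge like this is typically produced with a mergekit YAML configuration. The sketch below is a hypothetical reconstruction: the component models, base model, merge method, and the `int8_mask`/`bfloat16`/`normalize` settings come from this card, but the per-model `weight` and `density` values are illustrative assumptions, not the published recipe.

```python
# Hypothetical mergekit configuration for this DARE TIES merge.
# The weight/density values below are illustrative assumptions only.
import yaml  # pip install pyyaml

config = {
    "models": [
        {"model": "CultriX/Qwen2.5-14B-Wernicke",
         "parameters": {"weight": 0.25, "density": 0.5}},
        {"model": "VAGOsolutions/SauerkrautLM-v2-14b-DPO",
         "parameters": {"weight": 0.25, "density": 0.5}},
        {"model": "rombodawg/Rombos-LLM-V2.6-Qwen-14b",
         "parameters": {"weight": 0.25, "density": 0.5}},
        {"model": "allknowingroger/Qwenslerp2-14B",
         "parameters": {"weight": 0.25, "density": 0.5}},
    ],
    "merge_method": "dare_ties",      # DARE TIES, as stated on the card
    "base_model": "Qwen/Qwen2.5-14B",
    "parameters": {
        "normalize": True,            # rescale merged weights for scale consistency
        "int8_mask": True,            # int8 task-vector masks to reduce memory use
    },
    "dtype": "bfloat16",              # dtype used for the merge computation
}

# Write the config for use with: mergekit-yaml merge.yaml ./Qwenslerp4-14B
with open("merge.yaml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)
```

In DARE TIES, `density` sets the fraction of each model's delta parameters that survive random pruning before sign-consensus merging, and `weight` scales each surviving contribution; the actual values behind Qwenslerp4-14B may differ.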
Good for
- Applications requiring strong mathematical and logical reasoning.
- Tasks demanding high factual accuracy and general knowledge.
- Use cases where robust instruction following is critical.
- Developers seeking a merged model that balances conversational ability with specialized benchmark performance.
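
The model loads like any Qwen2.5-14B checkpoint in Hugging Face Transformers. A minimal inference sketch, assuming a GPU with enough memory for a 14.8B model in `bfloat16` (the prompt is only an example):

```python
# Minimal inference sketch using Hugging Face Transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allknowingroger/Qwenslerp4-14B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used for the merge
    device_map="auto",
)

# Qwen2.5 models ship a chat template; apply it for instruction-style prompts.
messages = [
    {"role": "user", "content": "If 3x + 7 = 22, what is x? Show your steps."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

At `bfloat16` the weights alone occupy roughly 30 GB, so smaller GPUs may need quantized loading (for example 8-bit via bitsandbytes).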