bunnycore/Qwen-2.5-7b-S1k
The bunnycore/Qwen-2.5-7b-S1k model is a 7.6-billion-parameter language model based on the Qwen 2.5 architecture, fine-tuned from bunnycore/Qwen-2.5-7B-Deep-Stock-v4. It supports a 32,768-token context length and targets complex reasoning tasks, as reflected in its strong results on the MATH Lvl 5 and IFEval benchmarks. The model suits applications that require advanced problem-solving and instruction-following capabilities.
Overview
The bunnycore/Qwen-2.5-7b-S1k is a 7.6-billion-parameter language model built on the Qwen 2.5 architecture. It is a merged model that combines bunnycore/Qwen-2.5-7B-Deep-Stock-v4 with a LoRA adapter, uses bfloat16 precision, and takes its tokenizer from bunnycore/Qwen-2.5-7B-Deep-Stock-v4.
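A merge of this shape is typically described with a mergekit-style configuration. The actual recipe is not published, so the fragment below is only an illustrative sketch: the LoRA adapter's name is not disclosed and appears here as a placeholder, and the merge method is assumed rather than confirmed.

```yaml
# Illustrative mergekit-style config (not the published recipe).
# <undisclosed-lora> is a placeholder; the "+" syntax applies a LoRA
# adapter on top of a base model before merging.
models:
  - model: bunnycore/Qwen-2.5-7B-Deep-Stock-v4+<undisclosed-lora>
merge_method: passthrough          # assumption; actual method unknown
dtype: bfloat16                    # matches the dtype stated above
tokenizer_source: bunnycore/Qwen-2.5-7B-Deep-Stock-v4
```

The `dtype` and `tokenizer_source` fields mirror the details given in the overview; everything else should be treated as hypothetical.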
Key Capabilities & Performance
Evaluated on the Open LLM Leaderboard, this model demonstrates a focus on instruction following and mathematical reasoning. Notable benchmark results include:
- IFEval (0-Shot): 71.62
- MATH Lvl 5 (4-Shot): 47.81
- BBH (3-Shot): 36.69
- MMLU-PRO (5-Shot): 37.58
These scores suggest strong capability in understanding and executing complex instructions, along with proficiency in advanced mathematical problem-solving.
Good For
- Applications requiring robust instruction following.
- Tasks involving mathematical reasoning and problem-solving at an advanced level.
- Use cases where a 7.6B parameter model with a 32K context window is suitable for balancing performance and resource efficiency.