bunnycore/Qwen-2.5-7b-S1k

Text Generation · Model Size: 7.6B · Quantization: FP8 · Context Length: 32K · Published: Feb 19, 2025 · Architecture: Transformer · Concurrency Cost: 1

The bunnycore/Qwen-2.5-7b-S1k model is a 7.6-billion-parameter language model based on the Qwen 2.5 architecture, derived from bunnycore/Qwen-2.5-7B-Deep-Stock-v4. It offers a 32,768-token (32K) context length and targets complex reasoning tasks, as reflected in its strong results on the MATH Lvl 5 and IFEval benchmarks. The model suits applications that require advanced problem solving and instruction following.
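For quick experimentation, a minimal sketch of loading and prompting the model with the Hugging Face transformers library might look like the following. The prompt and generation settings are illustrative only, and the sketch assumes the model inherits the usual Qwen 2.5 chat template:

```python
# Minimal usage sketch; the prompt and sampling settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "bunnycore/Qwen-2.5-7b-S1k"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the published dtype
    device_map="auto",
)

# Assumes a Qwen 2.5-style chat template is bundled with the tokenizer.
messages = [{"role": "user", "content": "Solve: if 3x + 7 = 22, what is x?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```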


Overview

The bunnycore/Qwen-2.5-7b-S1k is a 7.6-billion-parameter language model built on the Qwen 2.5 architecture. It is a merged model: bunnycore/Qwen-2.5-7B-Deep-Stock-v4 combined with a LoRA adapter, with weights stored in bfloat16. The tokenizer is sourced from bunnycore/Qwen-2.5-7B-Deep-Stock-v4.
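The card does not publish the exact merge recipe or identify the LoRA adapter, but a merge of this shape is commonly produced with the peft library along these lines (the adapter ID below is a hypothetical placeholder):

```python
# Illustrative sketch only: the actual merge recipe is not published.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "bunnycore/Qwen-2.5-7B-Deep-Stock-v4"

# Load the base model in bfloat16, matching the published dtype.
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# "your-org/s1k-lora-adapter" is a hypothetical placeholder for the LoRA used.
# Attach the adapter, then fold its weights into the base model.
merged = PeftModel.from_pretrained(base, "your-org/s1k-lora-adapter").merge_and_unload()

# The tokenizer comes from the base model, as the card states.
tokenizer = AutoTokenizer.from_pretrained(base_id)

merged.save_pretrained("Qwen-2.5-7b-S1k")
tokenizer.save_pretrained("Qwen-2.5-7b-S1k")
```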

Key Capabilities & Performance

Evaluated on the Open LLM Leaderboard, the model shows particular strength in instruction following and mathematical reasoning. Notable benchmark results include:

  • IFEval (0-shot): 71.62
  • MATH Lvl 5 (4-shot): 47.81
  • BBH (3-shot): 36.69
  • MMLU-Pro (5-shot): 37.58

These scores indicate strong comprehension and execution of complex instructions, together with solid proficiency in advanced mathematical problem solving.
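These figures come from the leaderboard's automated harness; a rough local reproduction of a single task with lm-evaluation-harness might look like the sketch below, though the leaderboard's exact settings (chat template, normalization, model revision) may differ:

```python
# Rough local reproduction sketch; leaderboard settings may differ in detail.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=bunnycore/Qwen-2.5-7b-S1k,dtype=bfloat16",
    tasks=["ifeval"],  # instruction-following benchmark, run 0-shot
    num_fewshot=0,
)
print(results["results"]["ifeval"])
```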

Good For

  • Applications requiring robust instruction following.
  • Tasks involving mathematical reasoning and problem-solving at an advanced level.
  • Use cases where a 7.6B-parameter model with a 32K context window offers the right balance of performance and resource efficiency.