Overview
MaziyarPanahi/calme-2.8-qwen2-7b is a fine-tuned version of the Qwen/Qwen2-7B model, developed by Maziyar Panahi. This 7.6-billion-parameter model aims to improve on the base model's performance across a range of benchmarks and supports a 131,072-token context window. It uses the ChatML prompt template for structured interactions.
Key Capabilities
- Enhanced General Performance: Aims to improve the base Qwen2-7B model's capabilities across various language tasks.
- Large Context Window: Supports a context length of 131,072 tokens, enabling processing of extensive inputs.
- ChatML Prompting: Designed to work with the ChatML format for clear system, user, and assistant turns.
- Quantized GGUF Versions: Available in optimized GGUF formats for efficient deployment and inference on various hardware.
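Since the model expects ChatML-formatted prompts, the layout can be sketched as a small formatter. This is a minimal illustration of the ChatML turn structure, not the model's official inference code; in practice a tokenizer's chat template would typically produce the same shape.

```python
# Minimal sketch of the ChatML prompt layout the model expects.
# <|im_start|> and <|im_end|> are the standard ChatML turn delimiters.

def to_chatml(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # Leave an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize GGUF in one sentence."},
])
print(prompt)
```

Each system, user, and assistant turn is delimited explicitly, which is what allows the model to keep roles distinct over long multi-turn conversations.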
Benchmark Performance
Evaluations on the Open LLM Leaderboard show an average score of 19.22. Specific metrics include:
- IFEval (0-Shot): 27.75
- BBH (3-Shot): 25.53
- MATH Lvl 5 (4-Shot): 15.63
- GPQA (0-Shot): 5.82
- MuSR (0-Shot): 12.06
- MMLU-PRO (5-shot): 28.51
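The reported average is the unweighted mean of the six individual metrics, which a quick check confirms:

```python
# Unweighted mean of the six Open LLM Leaderboard metrics listed above.
scores = {
    "IFEval": 27.75,
    "BBH": 25.53,
    "MATH Lvl 5": 15.63,
    "GPQA": 5.82,
    "MuSR": 12.06,
    "MMLU-PRO": 28.51,
}
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 19.22
```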
Good for
- Developers seeking a fine-tuned Qwen2-7B variant with improved general performance.
- Applications requiring a model with a very large context window.
- Use cases where ChatML prompting is preferred for structured conversational AI.