MaziyarPanahi/calme-2.3-qwen2-7b

License: apache-2.0

MaziyarPanahi/calme-2.3-qwen2-7b is a 7.6 billion parameter causal language model fine-tuned by MaziyarPanahi from the Qwen/Qwen2-7B base model. The fine-tune aims to improve on Qwen2-7B across a range of benchmarks, reaching an average score of 22.74 on the Open LLM Leaderboard. With a context length of 131072 tokens, it is suited to general-purpose language tasks and improved reasoning.

Model Overview

MaziyarPanahi/calme-2.3-qwen2-7b is a 7.6 billion parameter language model developed by MaziyarPanahi. It is a fine-tuned iteration of the Qwen/Qwen2-7B base model, specifically designed to achieve improved performance across a range of benchmarks.

Key Capabilities & Performance

This model demonstrates enhanced capabilities as reflected in its Open LLM Leaderboard evaluation results. Key performance metrics include:

  • Average Score: 22.74
  • IFEval (0-Shot): 38.25
  • BBH (3-Shot): 30.96
  • MATH Lvl 5 (4-Shot): 18.66
  • MMLU-PRO (5-Shot): 29.01

The model uses the ChatML prompt template, making it compatible with standard instruction-following formats. It also supports a large context window of 131072 tokens, allowing it to process extensive inputs.
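For illustration, ChatML wraps each conversation turn in `<|im_start|>`/`<|im_end|>` markers and ends with an open assistant turn for the model to complete. A minimal sketch of building such a prompt (the helper name and message layout are illustrative, not taken from the model card):

```python
def build_chatml_prompt(messages):
    """Format a list of {role, content} dicts into a ChatML prompt string.

    Each turn becomes:
        <|im_start|>{role}\n{content}<|im_end|>
    followed by an open assistant turn for the model to continue.
    """
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    parts.append("<|im_start|>assistant\n")  # model generates from here
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize ChatML in one sentence."},
])
print(prompt)
```

In practice, tokenizers that ship a chat template (e.g. via `tokenizer.apply_chat_template` in Transformers) handle this formatting automatically; the sketch above only shows what the resulting prompt looks like.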

Usage Considerations

  • Fine-tuned for general improvement: This model is intended for users seeking a more capable version of the Qwen2-7B base model for various language generation and understanding tasks.
  • Quantized versions available: For users requiring optimized inference, quantized GGUF models are provided at MaziyarPanahi/calme-2.3-qwen2-7b-GGUF.
  • Prompt format: Adheres to the ChatML format for structured conversations.