MaziyarPanahi/calme-2.1-qwen2-7b

7.6B parameters · FP8 · 32768 context · License: apache-2.0

MaziyarPanahi/calme-2.1-qwen2-7b is a 7.6 billion parameter language model fine-tuned by Maziyar Panahi, based on the Qwen/Qwen2-7B architecture. The fine-tune aims to improve the base model's performance across various benchmarks, and it is designed for broad applications requiring a robust, general-purpose LLM with a substantial 131,072-token context length.

Overview

MaziyarPanahi/calme-2.1-qwen2-7b is a 7.6 billion parameter language model developed by Maziyar Panahi. It is a fine-tuned iteration of the Qwen/Qwen2-7B base model, specifically engineered to deliver improved performance across a range of benchmarks. The model supports a significant context length of 131,072 tokens, making it suitable for processing extensive inputs.

Key Capabilities & Performance

This model demonstrates enhanced general capabilities compared to its base model. Its performance has been evaluated on the Open LLM Leaderboard, showing an average score of 23.20. Specific benchmark results include:

  • IFEval (0-shot): 38.16
  • BBH (3-shot): 31.01
  • MATH Lvl 5 (4-shot): 21.07
  • GPQA (0-shot): 5.26
  • MuSR (0-shot): 13.80
  • MMLU-PRO (5-shot): 29.92

Prompt Template

The model utilizes the ChatML prompt template, which is a common and flexible format for conversational AI. This template structures interactions with system, user, and assistant roles, facilitating clear communication.
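The ChatML structure described above can be sketched as a small formatting helper. This is an illustrative reconstruction of the format (the `build_chatml_prompt` function is hypothetical, not part of the model's tooling); in practice the tokenizer's chat template produces the same layout for you.

```python
# Minimal sketch of the ChatML prompt format used by this model.
# The special tokens <|im_start|> and <|im_end|> delimit each turn;
# the role name (system/user/assistant) follows the start token.

def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a single-turn ChatML prompt ending with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt("You are a helpful assistant.", "What is ChatML?")
print(prompt)
```

When using the transformers library, `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` renders this format automatically from a list of role/content message dicts.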

Usage

Users can interact with this model using the transformers library, either through a high-level pipeline for quick setup or by directly loading the tokenizer and model for more customized applications. Quantized GGUF versions are also available for efficient deployment.
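The pipeline-based route mentioned above can be sketched as follows. The `chat` helper and its defaults are illustrative assumptions, not an official API for this model; only the model ID comes from the card.

```python
# Sketch of loading the model via the high-level transformers pipeline API.
# MODEL_ID comes from the model card; everything else is an illustrative default.

MODEL_ID = "MaziyarPanahi/calme-2.1-qwen2-7b"

def chat(messages, max_new_tokens=256):
    """Run one chat completion. transformers is imported lazily so the
    multi-GB model download only happens when this is actually called."""
    from transformers import pipeline  # requires `pip install transformers`
    pipe = pipeline("text-generation", model=MODEL_ID, device_map="auto")
    out = pipe(messages, max_new_tokens=max_new_tokens)
    # With chat-style input, generated_text holds the conversation
    # including the assistant's reply.
    return out[0]["generated_text"]

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize ChatML in one sentence."},
]
# response = chat(messages)  # uncomment on a machine with enough VRAM/RAM
```

For more control, `AutoTokenizer` and `AutoModelForCausalLM` can be loaded directly instead of the pipeline, and the GGUF quantizations can be run with llama.cpp-compatible tooling.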