MaziyarPanahi/calme-2.4-qwen2-7b

Text Generation

  • Model Size: 7.6B
  • Quantization: FP8
  • Context Length: 32k
  • Concurrency Cost: 1
  • License: apache-2.0
  • Architecture: Transformer
  • Open Weights: Yes

MaziyarPanahi/calme-2.4-qwen2-7b is a 7.6 billion parameter language model fine-tuned by MaziyarPanahi from the Qwen2-7B base model. The fine-tune aims to improve the base model's performance across a range of benchmarks, as reflected in its Open LLM Leaderboard results, and is intended for general language generation tasks.


Model Overview

MaziyarPanahi/calme-2.4-qwen2-7b is a fine-tuned version of the Qwen/Qwen2-7B base model, developed by MaziyarPanahi. This 7.6 billion parameter model focuses on enhancing the base model's performance across a range of benchmarks, as evidenced by its evaluation on the Open LLM Leaderboard.

Key Capabilities and Performance

The model's performance is highlighted by its scores on the Open LLM Leaderboard:

  • Average Score: 22.52
  • IFEval (0-shot): 33.00
  • BBH (3-shot): 31.82
  • MATH Lvl 5 (4-shot): 18.35
  • GPQA (0-shot): 4.47
  • MuSR (0-shot): 14.43
  • MMLU-PRO (5-shot): 33.08

These metrics indicate balanced performance across reasoning, knowledge, and instruction-following tasks, with the strongest results on IFEval (33.00) and MMLU-PRO (33.08).
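As a quick sanity check, the reported average is simply the mean of the six individual benchmark scores:

```python
# Open LLM Leaderboard scores reported above
scores = {
    "IFEval": 33.00,
    "BBH": 31.82,
    "MATH Lvl 5": 18.35,
    "GPQA": 4.47,
    "MuSR": 14.43,
    "MMLU-PRO": 33.08,
}

average = sum(scores.values()) / len(scores)
assert abs(average - 22.52) < 0.01  # consistent with the reported 22.52
```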

Technical Details

  • Base Model: Qwen2-7B
  • Parameter Count: 7.6 billion
  • Context Length: 131,072 tokens (the base model's maximum; the hosted deployment listed above is configured for a 32k context)
  • Prompt Template: Uses the ChatML format, which includes system, user, and assistant roles for structured conversations.
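The ChatML template can be assembled by hand; here is a minimal sketch of a single-turn prompt, using the standard ChatML special tokens that Qwen2-based models expect:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Format a single-turn conversation in the ChatML style used by Qwen2-based models."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # generation continues from here
    )

prompt = build_chatml_prompt(
    "You are a helpful assistant.",
    "What is the capital of France?",
)
```

In practice, `tokenizer.apply_chat_template` from the `transformers` library produces this formatting automatically from a list of role/content messages, so hand-building the string is mainly useful for debugging or non-transformers runtimes.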

Quantized Versions

Quantized GGUF versions of this model are available for more efficient deployment and inference, accessible at MaziyarPanahi/calme-2.4-qwen2-7b-GGUF.
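A sketch of running one of the GGUF quantizations locally with `llama-cpp-python`. The quantization level and filename pattern below are assumptions based on the common `<model>.<quant>.gguf` naming convention; verify the exact filenames against the file list in the GGUF repository:

```python
GGUF_REPO = "MaziyarPanahi/calme-2.4-qwen2-7b-GGUF"

def gguf_filename(quant: str = "Q4_K_M") -> str:
    # Hypothetical filename following the common <model>.<quant>.gguf convention;
    # check the repo's "Files" tab on Hugging Face for the actual names.
    return f"calme-2.4-qwen2-7b.{quant}.gguf"

# Example usage (requires `pip install llama-cpp-python` and network access
# to download the model file on first run):
# from llama_cpp import Llama
# llm = Llama.from_pretrained(repo_id=GGUF_REPO, filename=gguf_filename(), n_ctx=4096)
# out = llm.create_chat_completion(messages=[{"role": "user", "content": "Hello!"}])
```

Lower quantization levels (e.g. Q4) trade some quality for a smaller memory footprint, which is the main reason to prefer a GGUF build over the full-precision weights on consumer hardware.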

Intended Use

This model is suited to general-purpose language generation and understanding tasks where improved performance over the base Qwen2-7B is desired, particularly those requiring instruction following and reasoning.