MaziyarPanahi/calme-2.8-qwen2-7b

Hugging Face
Text generation · Model size: 7.6B · Quant: FP8 · Context length: 32k · Published: Jun 27, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

MaziyarPanahi/calme-2.8-qwen2-7b is a 7.6-billion-parameter language model fine-tuned by Maziyar Panahi from Qwen/Qwen2-7B. The fine-tune aims to improve the base model's benchmark performance while retaining its 131,072-token native context length, and it targets general language understanding and generation tasks.


Overview

MaziyarPanahi/calme-2.8-qwen2-7b is a fine-tuned iteration of the Qwen/Qwen2-7B model, developed by Maziyar Panahi. This 7.6-billion-parameter model is designed to improve on the base model's performance across a range of benchmarks while leveraging a 131,072-token context window, and it uses the ChatML prompt template for structured interactions.

Key Capabilities

  • Enhanced General Performance: Aims to improve the base Qwen2-7B model's capabilities across various language tasks.
  • Large Context Window: Supports a context length of 131,072 tokens, enabling processing of extensive inputs.
  • ChatML Prompting: Designed to work with the ChatML format for clear system, user, and assistant turns.
  • Quantized GGUF Versions: Available in optimized GGUF formats for efficient deployment and inference on various hardware.
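
The ChatML template wraps each conversation turn in `<|im_start|>` / `<|im_end|>` markers. A minimal sketch of assembling a single-turn prompt by hand (the helper function name is illustrative, not part of the model's API):

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a single-turn ChatML prompt.

    The trailing '<|im_start|>assistant' line cues the model to
    generate the assistant's reply.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt("You are a helpful assistant.", "What is GGUF?")
print(prompt)
```

In practice, tokenizers that ship a chat template can produce this string for you; building it manually is mainly useful when serving the model through a raw-completion endpoint.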

Benchmark Performance

Evaluations on the Open LLM Leaderboard give an average score of 19.22, the arithmetic mean of the six benchmark scores below:

  • IFEval (0-shot): 27.75
  • BBH (3-shot): 25.53
  • MATH Lvl 5 (4-shot): 15.63
  • GPQA (0-shot): 5.82
  • MuSR (0-shot): 12.06
  • MMLU-PRO (5-shot): 28.51
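
The reported average is simply the arithmetic mean of the six scores, which can be confirmed quickly:

```python
# Open LLM Leaderboard scores reported above for calme-2.8-qwen2-7b.
scores = {
    "IFEval (0-shot)": 27.75,
    "BBH (3-shot)": 25.53,
    "MATH Lvl 5 (4-shot)": 15.63,
    "GPQA (0-shot)": 5.82,
    "MuSR (0-shot)": 12.06,
    "MMLU-PRO (5-shot)": 28.51,
}

average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 19.22
```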

Good for

  • Developers seeking a fine-tuned Qwen2-7B variant with improved general performance.
  • Applications requiring a model with a very large context window.
  • Use cases where ChatML prompting is preferred for structured conversational AI.