MaziyarPanahi/calme-2.1-qwen2-72b

Cold
Public
72.7B
FP8
131072
License: tongyi-qianwen
Hugging Face
Overview

Model Overview

MaziyarPanahi/calme-2.1-qwen2-72b is a fine-tuned version of the Qwen/Qwen2-72B-Instruct model, developed by Maziyar Panahi. This 72.7 billion parameter model focuses on enhancing natural language understanding and generation capabilities across various tasks. It supports a substantial context length of 131,072 tokens, making it suitable for processing extensive inputs.

Key Capabilities

  • Advanced Question-Answering: Designed for complex information retrieval and response generation.
  • Intelligent Chatbots & Virtual Assistants: Capable of engaging in sophisticated conversational interactions.
  • Content Generation & Summarization: Efficiently creates and condenses textual content.
  • Code Generation & Analysis: Supports the creation and understanding of programming code.
  • Complex Problem-Solving: Aims to assist in decision support and intricate problem resolution.

Performance Highlights

Evaluations on the Open LLM Leaderboard show an average score of 43.61. Specific task performance includes:

  • IFEval (0-Shot): 81.63
  • BBH (3-Shot): 57.33
  • MATH Lvl 5 (4-Shot): 36.03
  • MMLU-PRO (5-shot): 49.05
  • GSM8K (5-shot): 0.8582 (strict-match)

Prompt Template

The model utilizes the ChatML prompt template for structured input and output.

Ethical Considerations

Users are advised to be aware of potential biases and limitations inherent in large language models and to implement appropriate safeguards and human oversight in production environments.