MaziyarPanahi/calme-3.2-instruct-3b

Warm
Public
3.1B
BF16
32768
License: qwen-research
Hugging Face
Overview

Overview

MaziyarPanahi/calme-3.2-instruct-3b is a 3.1 billion parameter instruction-tuned language model, developed by MaziyarPanahi. It is an advanced fine-tuned version of the Qwen/Qwen2.5-3B base model, specifically enhanced for improved performance across generic domains. The model supports a substantial context length of 32768 tokens.

Key Capabilities

  • Instruction Following: Fine-tuned to respond effectively to a wide range of instructions.
  • Generic Domain Enhancement: Optimized for general-purpose applications, aiming for broad utility.
  • Quantized Versions Available: GGUF quantized models are provided for efficient deployment on various hardware.
  • ChatML Prompt Template: Utilizes the ChatML format for structured conversational interactions.

Performance Insights

Evaluated on the Open LLM Leaderboard, the model achieved an average score of 22.66. Specific metrics include:

  • IFEval (0-Shot): 55.33
  • BBH (3-Shot): 27.98
  • MMLU-PRO (5-shot): 29.48

Considerations

As a relatively small model, calme-3.2-instruct-3b may exhibit sensitivity to hyper-parameters and might not perform optimally for all complex prompts. Users are encouraged to provide feedback for future iterations. Ethical considerations regarding potential biases and limitations, common to large language models, should be noted, recommending safeguards and human oversight in production environments.