MaziyarPanahi/calme-3.3-instruct-3b

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:3.1BQuant:BF16Ctx Length:32kPublished:Nov 7, 2024License:qwen-researchArchitecture:Transformer0.0K Warm

MaziyarPanahi/calme-3.3-instruct-3b is a 3.1 billion parameter instruction-tuned causal language model developed by MaziyarPanahi. It is an advanced iteration of Qwen/Qwen2.5-3B, specifically fine-tuned to enhance its capabilities across generic domains. This model is designed for general-purpose conversational AI and text generation tasks, offering a balance between size and performance for various applications.

Loading preview...

Model Overview

MaziyarPanahi/calme-3.3-instruct-3b is a 3.1 billion parameter instruction-tuned language model, building upon the Qwen/Qwen2.5-3B architecture. Developed by MaziyarPanahi, this iteration focuses on enhancing generic domain capabilities through fine-tuning.

Key Characteristics

  • Base Model: Fine-tuned from Qwen/Qwen2.5-3B.
  • Parameter Count: 3.1 billion parameters, offering a compact yet capable model size.
  • Context Length: Supports a context window of 32768 tokens.
  • Instruction Following: Utilizes the ChatML prompt template for structured instruction-following.
  • Quantized Versions: GGUF quantized models are available for efficient deployment.

Performance Insights

Evaluations on the Open LLM Leaderboard indicate an average score of 21.55. Specific metrics include 64.23 on IFEval (0-Shot) and 25.68 on BBH (3-Shot). It's noted as a relatively small model, which may impact performance on complex prompts and make it sensitive to hyperparameters.

Use Cases

This model is suitable for a range of general-purpose text generation and conversational AI tasks where a smaller, efficient model is preferred. Users should consider its size and evaluate performance for specific applications, especially those requiring high accuracy in complex reasoning or mathematical domains.