MaziyarPanahi/calme-3.2-baguette-3b

Text generation · Concurrency cost: 1 · Model size: 3.1B · Quant: BF16 · Context length: 32k · License: qwen-research · Architecture: Transformer

MaziyarPanahi/calme-3.2-baguette-3b is a 3.1 billion parameter causal language model, an advanced iteration of Qwen/Qwen2.5-3B. It is fine-tuned to improve general-domain capabilities in both French and English, making it well suited to dual-language and multilingual applications.


MaziyarPanahi/calme-3.2-baguette-3b: A Dual-Language Fine-Tune

This model, developed by MaziyarPanahi, is a 3.1 billion parameter language model built upon the Qwen/Qwen2.5-3B architecture. It has undergone specific fine-tuning to improve its general domain capabilities in both French and English, making it suitable for applications requiring proficiency in these two languages.

Key Capabilities & Features

  • Dual-Language Proficiency: Enhanced performance in both French and English general domains.
  • Base Model: Leverages the robust Qwen/Qwen2.5-3B as its foundation.
  • Quantized Versions: GGUF quantized models are available for efficient deployment.
  • Prompt Template: Utilizes the ChatML prompt format for structured interactions.

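Since the model uses the ChatML prompt format, requests should wrap each turn in `<|im_start|>`/`<|im_end|>` markers. Below is a minimal sketch of that layout; the helper name `build_chatml_prompt` is hypothetical, and in practice `tokenizer.apply_chat_template` from the `transformers` library assembles this for you.

```python
# Illustrative helper showing the ChatML turn layout (hypothetical name;
# in practice, use tokenizer.apply_chat_template from transformers).
def build_chatml_prompt(messages):
    """messages: list of {"role": ..., "content": ...} dicts."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # A trailing open assistant turn signals the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful bilingual assistant."},
    {"role": "user", "content": "Bonjour ! Peux-tu te présenter ?"},
])
print(prompt)
```

The same prompt string works with the BF16 checkpoint or with the GGUF quantized variants served through ChatML-aware runtimes.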
Performance Insights

On the Open LLM Leaderboard the model averages 22.14, with IFEval (0-shot) at 63.38, BBH (3-shot) at 25.87, and MMLU-PRO (5-shot) at 25.98. Note that, as with other small models, performance can be sensitive to hyperparameters and prompt wording.

Good For

  • Applications requiring a compact model with enhanced French and English language understanding.
  • Experimentation with fine-tuned Qwen2.5-3B variants.
  • Use cases where a balance between model size and dual-language capability is crucial.