MaziyarPanahi/calme-3.2-baguette-3b: A Dual-Language Fine-Tune
This model, developed by MaziyarPanahi, is a roughly 3.1-billion-parameter language model fine-tuned from Qwen/Qwen2.5-3B. The fine-tuning targets general-domain capability in both French and English, making it suitable for applications that require proficiency in both languages.
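As a rough illustration, the model should load through the standard transformers API in the same way as other Qwen2.5 fine-tunes; the sketch below is a minimal example under that assumption (the dtype and device settings are illustrative choices, not prescribed by the model card).

```python
# A minimal sketch, assuming the standard transformers AutoModel API applies
# to this repository as it does to other Qwen2.5 fine-tunes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "MaziyarPanahi/calme-3.2-baguette-3b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: half precision fits the hardware
    device_map="auto",
)

# ChatML-style conversation; apply_chat_template renders the model's own template.
messages = [
    {"role": "system", "content": "You are a helpful bilingual assistant."},
    {"role": "user", "content": "Résume l'histoire de la baguette en une phrase."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```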
Key Capabilities & Features
- Dual-Language Proficiency: Improved general-domain performance in both French and English.
- Base Model: Leverages the robust Qwen/Qwen2.5-3B as its foundation.
- Quantized Versions: GGUF quantized models are available for efficient local deployment (see the llama-cpp-python sketch after this list).
- Prompt Template: Uses the ChatML prompt format for structured interactions (the raw template is shown after this list).
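For reference, ChatML wraps each conversation turn in `<|im_start|>` / `<|im_end|>` markers. A prompt for this model would look like the following (the system and user text are placeholders):

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Quelle est la capitale de la France ?<|im_end|>
<|im_start|>assistant
```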
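For the GGUF builds, a llama-cpp-python sketch along these lines should work; the quantization filename below is a hypothetical placeholder, so check the GGUF repository for the files actually published.

```python
# A minimal sketch using llama-cpp-python; the filename below is a
# hypothetical placeholder -- check the GGUF repository for actual names.
from llama_cpp import Llama

llm = Llama(
    model_path="calme-3.2-baguette-3b.Q4_K_M.gguf",  # hypothetical filename
    n_ctx=4096,           # context window; adjust to available memory
    chat_format="chatml",  # matches the model's prompt template
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain GGUF quantization briefly."}],
    max_tokens=128,
)
print(response["choices"][0]["message"]["content"])
```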
Performance Insights
On the Open LLM Leaderboard, the model reports an average score of 22.14, with:
- IFEval (0-shot): 63.38
- BBH (3-shot): 25.87
- MMLU-PRO (5-shot): 25.98
As with other small models, performance can be sensitive to hyperparameters and the exact prompt.
Good For
- Applications requiring a compact model with enhanced French and English language understanding.
- Experimentation with fine-tuned Qwen2.5-3B variants.
- Use cases where a balance between model size and dual-language capability is crucial.