Maestrale Chat v0.4 Alpha SFT: An Italian-Optimized Mistral-7b Variant
Maestrale chat v0.4 alpha SFT is a 7-billion-parameter language model developed by @efederici and @mferraretto. It is built on the Mistral-7b architecture, continually pre-trained on a curated, large-scale, high-quality Italian corpus, merged with the occiglot-7b-eu5 model, and then fine-tuned with supervised fine-tuning (SFT) on 1.7 million conversations/instructions for two epochs.
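As a minimal sketch of how such an SFT chat model might be used, the snippet below loads the checkpoint with Hugging Face transformers and generates a reply through the tokenizer's chat template. The repository ID, dtype, and sampling parameters are assumptions for illustration and are not confirmed by this document.

```python
# Sketch: chatting with the model via Hugging Face transformers.
# The repo ID below is a hypothetical placeholder; substitute the actual checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mii-llm/maestrale-chat-v0.4-alpha-sft"  # hypothetical repo ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 7B model in bf16 fits on a single 24 GB GPU
    device_map="auto",
)

# SFT chat models expect their conversation format; the tokenizer's chat
# template applies it for us.
messages = [
    # "Briefly explain what the Maestrale wind is."
    {"role": "user", "content": "Spiega brevemente cos'è il vento di Maestrale."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The same pattern applies to the agentic, mindmap, and literary prompts described below; only the user message changes.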
Key Capabilities and Improvements (v0.4):
- Italian Language Focus: Optimized for high performance in Italian, leveraging extensive pre-training on relevant datasets.
- Enhanced Reasoning: Demonstrates improved capabilities in mathematical and general reasoning tasks.
- Truthfulness: Features improved truthfulness in its responses.
- Agentic Behavior: Incorporates agent-like functionalities.
- Mermaid Mindmaps: Capable of generating Mermaid mindmaps.
- Literary Tasks: Shows proficiency in tasks such as Latin translations and poem generation.
Performance Highlights:
Preliminary scores indicate competitive performance on Italian benchmarks:
- Hellaswag_it: acc 0.5220, acc_norm 0.6887
- ARC_it: acc 0.1762, acc_norm 0.5090
- M_MMLU_it: acc 0.569
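These metrics follow the naming used by the EleutherAI lm-evaluation-harness; a hedged sketch of how such scores could be reproduced is shown below. The task names, repo ID, and few-shot configuration (not stated in this document) are assumptions.

```python
# Sketch: evaluating the model on Italian benchmarks with lm-evaluation-harness
# (pip install lm-eval). Task names and repo ID are assumptions, not confirmed here.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=mii-llm/maestrale-chat-v0.4-alpha-sft,dtype=bfloat16",  # hypothetical repo ID
    tasks=["hellaswag_it", "arc_it", "m_mmlu_it"],  # assumed task names
    batch_size=8,
)

# Print the acc / acc_norm values reported for each task.
for task, metrics in results["results"].items():
    print(task, metrics)
```

Scores obtained this way will only match the figures above if the same harness version, prompt format, and few-shot settings are used.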
Intended Use:
This model is designed for chat and instruction-following applications, particularly where strong Italian language understanding and generation are required. Although it is an alpha release, the model can refuse to answer certain prompts, reflecting some attention to safety.