Model Overview
e-palmisano/Qwen2-0.5B-ITA-Instruct is a 0.5-billion-parameter language model developed by e-palmisano and fine-tuned from unsloth/Qwen2-0.5B-Instruct-bnb-4bit. The model was adapted to Italian through a two-stage fine-tuning process: first, continued pretraining with Unsloth on a 100k-row subset of the gsarti/clean_mc4_it dataset to strengthen Italian language understanding; second, instruction fine-tuning on the FreedomIntelligence/alpaca-gpt4-italian dataset to improve instruction-following in Italian.
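As context for how the model can be loaded, here is a minimal sketch using the generic Hugging Face transformers API. It assumes the repository ships standard Qwen2 tokenizer and model files and that AutoTokenizer / AutoModelForCausalLM resolve them; device_map="auto" additionally assumes the accelerate package is installed.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "e-palmisano/Qwen2-0.5B-ITA-Instruct"

# Load tokenizer and weights; device_map="auto" places the 0.5B model
# on a GPU if one is available, otherwise on CPU.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
```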
Key Capabilities
- Italian Language Proficiency: Significantly improved understanding and generation of Italian text due to targeted fine-tuning.
- Instruction Following: Enhanced ability to respond to instructions and prompts in Italian (see the generation sketch after this list).
- Efficient Training: Fine-tuned with Unsloth for faster, lower-memory training, and small enough at 0.5B parameters to be a practical option for Italian-centric applications.
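The instruction-following behavior can be exercised through the tokenizer's chat template. The sketch below continues from the loading sketch above and assumes the model inherits a Qwen2-style chat template; the Italian prompt is purely illustrative.

```python
# Continues from the loading sketch above (tokenizer and model already created).
# The Italian instruction below is an illustrative example, not from the model card.
messages = [
    {"role": "user", "content": "Riassumi in due frasi la storia del Colosseo."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=200, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```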
Evaluation
Performance metrics for the model on Italian language benchmarks include:
- hellaswag_it acc_norm: 36.28
- arc_it acc_norm: 27.63
- m_mmlu_it 5-shot acc: 35.4
- Average: 33.1
For a comprehensive comparison, users can refer to the Leaderboard for Italian Language Models.
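The scores above match the benchmark names used by multilingual tasks in EleutherAI's lm-evaluation-harness. As a rough, hedged illustration of how such numbers could be reproduced locally, the sketch below uses the harness's Python API; the task names mirror the benchmarks listed above, but their availability and exact configuration (few-shot counts, prompt format) depend on the installed harness version, so this is an assumption rather than the leaderboard's exact pipeline.

```python
# Sketch only: assumes lm-evaluation-harness (pip install lm-eval) exposes
# simple_evaluate and multilingual tasks named arc_it / hellaswag_it / m_mmlu_it.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=e-palmisano/Qwen2-0.5B-ITA-Instruct,dtype=float16",
    tasks=["arc_it", "hellaswag_it"],  # acc_norm benchmarks listed above
    num_fewshot=0,
    batch_size=8,
)
print(results["results"])

# m_mmlu_it is reported 5-shot above, so it needs a separate run.
mmlu = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=e-palmisano/Qwen2-0.5B-ITA-Instruct,dtype=float16",
    tasks=["m_mmlu_it"],
    num_fewshot=5,
    batch_size=8,
)
print(mmlu["results"])
```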
Good For
- Applications requiring strong Italian language generation and comprehension.
- Instruction-based tasks and chatbots in Italian.
- Developers looking for an efficient, Italian-optimized Qwen2-based model.