e-palmisano/Qwen2-0.5B-ITA-Instruct

  • Parameters: 0.5B
  • Tensor type: BF16
  • Context length: 32,768 tokens
  • License: apache-2.0

Model Overview

e-palmisano/Qwen2-0.5B-ITA-Instruct is a 0.5-billion-parameter language model developed by e-palmisano and fine-tuned from unsloth/Qwen2-0.5B-Instruct-bnb-4bit. The model was adapted to Italian in two fine-tuning stages: first, continued pretraining with Unsloth on a 100k-row subset of the gsarti/clean_mc4_it dataset to strengthen its Italian linguistic understanding, and second, instruction tuning on the FreedomIntelligence/alpaca-gpt4-italian dataset to improve its ability to follow instructions in Italian.
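
Because the model keeps the standard Qwen2 chat format, it can be loaded and queried with the Hugging Face transformers library. The sketch below is a minimal example; the Italian prompt and generation settings are illustrative assumptions, not part of the model card.

```python
# Minimal inference sketch with transformers; prompt and sampling settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "e-palmisano/Qwen2-0.5B-ITA-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the published BF16 weights
    device_map="auto",
)

# Qwen2-Instruct models use a chat template; apply it to build the prompt.
messages = [
    {"role": "system", "content": "Sei un assistente utile che risponde in italiano."},
    {"role": "user", "content": "Spiega brevemente cos'è il Rinascimento."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```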

Key Capabilities

  • Italian Language Proficiency: Significantly improved understanding and generation of Italian text due to targeted fine-tuning.
  • Instruction Following: Enhanced ability to respond to instructions and prompts in Italian.
  • Efficient Training: fine-tuned with Unsloth for faster, memory-efficient training, which makes further Italian-centric adaptation more accessible (a minimal fine-tuning sketch follows this list).
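
The sketch below illustrates the kind of Unsloth + TRL setup used for the instruction-tuning stage only (the continued-pretraining stage on gsarti/clean_mc4_it is not shown). It assumes a trl version whose SFTTrainer accepts these arguments; the LoRA settings, hyperparameters, formatting function, and dataset field names are assumptions, not the author's actual training configuration.

```python
# Sketch of an Unsloth + TRL instruction-tuning run; all settings below are illustrative.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the 4-bit base model the card says was used as the starting point.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2-0.5B-Instruct-bnb-4bit",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters so only a small set of weights is trained.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

dataset = load_dataset("FreedomIntelligence/alpaca-gpt4-italian", split="train")

def format_example(example):
    # Hypothetical formatting: field names are assumed and may need to be
    # adjusted to the dataset's actual schema.
    messages = [
        {"role": "user", "content": example["instruction"]},
        {"role": "assistant", "content": example["output"]},
    ]
    return {"text": tokenizer.apply_chat_template(messages, tokenize=False)}

dataset = dataset.map(format_example)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
        learning_rate=2e-4,
        output_dir="qwen2-0.5b-ita-sft",
    ),
)
trainer.train()
```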

Evaluation

Performance metrics for the model on Italian language benchmarks include:

  • hellaswag_it acc_norm: 36.28
  • arc_it acc_norm: 27.63
  • m_mmlu_it 5-shot acc: 35.4
  • Average: 33.1

For a comprehensive comparison, users can refer to the Leaderboard for Italian Language Models.
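
The scores above can be approximately reproduced with EleutherAI's lm-evaluation-harness. The sketch below assumes the harness's multilingual (okapi) task names and a recent harness version, as well as the few-shot settings implied above; the leaderboard's exact configuration may differ.

```python
# Sketch of reproducing the Italian benchmark scores with lm-evaluation-harness.
import lm_eval

common = dict(
    model="hf",
    model_args="pretrained=e-palmisano/Qwen2-0.5B-ITA-Instruct,dtype=bfloat16",
)

# hellaswag_it and arc_it are reported as acc_norm (assumed 0-shot here);
# m_mmlu_it is reported as 5-shot accuracy.
zero_shot = lm_eval.simple_evaluate(tasks=["hellaswag_it", "arc_it"], **common)
five_shot = lm_eval.simple_evaluate(tasks=["m_mmlu_it"], num_fewshot=5, **common)

for res in (zero_shot, five_shot):
    for task, metrics in res["results"].items():
        print(task, metrics)
```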

Good For

  • Applications requiring strong Italian language generation and comprehension.
  • Instruction-based tasks and chatbots in Italian.
  • Developers looking for an efficient, Italian-optimized Qwen2-based model.