e-palmisano/Qwen2-0.5B-ITA-Instruct
Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:0.5BQuant:BF16Ctx Length:32kPublished:Jul 3, 2024License:apache-2.0Architecture:Transformer Open Weights Warm

The e-palmisano/Qwen2-0.5B-ITA-Instruct is a 0.5 billion parameter Qwen2-based causal language model developed by e-palmisano. It has been fine-tuned using Unsloth for improved Italian language capabilities and instruction following, leveraging the gsarti/clean_mc4_it and FreedomIntelligence/alpaca-gpt4-italian datasets. This model is optimized for Italian language tasks and instruction-based interactions, offering a context length of 32768 tokens.

Loading preview...

Model Overview

The e-palmisano/Qwen2-0.5B-ITA-Instruct is a 0.5 billion parameter language model developed by e-palmisano, fine-tuned from unsloth/Qwen2-0.5B-Instruct-bnb-4bit. This model was specifically enhanced for the Italian language through a two-stage fine-tuning process. The initial stage involved continuous pretraining with Unsloth on a subset of the gsarti/clean_mc4_it dataset (100k rows) to boost its Italian linguistic understanding. The second stage focused on instruction-following capabilities, utilizing the FreedomIntelligence/alpaca-gpt4-italian dataset.

Key Capabilities

  • Italian Language Proficiency: Significantly improved understanding and generation of Italian text due to targeted fine-tuning.
  • Instruction Following: Enhanced ability to respond to instructions and prompts in Italian.
  • Efficient Training: Leverages Unsloth for faster training, making it a more accessible option for Italian-centric applications.

Evaluation

Performance metrics for the model on Italian language benchmarks include:

  • hellaswag_it acc_norm: 36.28
  • arc_it acc_norm: 27.63
  • m_mmlu_it 5-shot acc: 35.4
  • Average: 33.1

For a comprehensive comparison, users can refer to the Leaderboard for Italian Language Models.

Good For

  • Applications requiring strong Italian language generation and comprehension.
  • Instruction-based tasks and chatbots in Italian.
  • Developers looking for an efficient, Italian-optimized Qwen2-based model.