e-palmisano/Qwen2-1.5B-ITA-Instruct

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:1.5BQuant:BF16Ctx Length:32kPublished:Jul 2, 2024License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

e-palmisano/Qwen2-1.5B-ITA-Instruct is a 1.5 billion parameter Qwen2-based causal language model developed by e-palmisano. It has been fine-tuned specifically to improve Italian language capabilities using the gsarti/clean_mc4_it and FreedomIntelligence/alpaca-gpt4-italian datasets. This model is optimized for Italian language understanding and instruction-following tasks, offering a specialized solution for Italian NLP applications. It leverages Unsloth for faster training and supports a context length of 131072 tokens.

Loading preview...

Overview

e-palmisano/Qwen2-1.5B-ITA-Instruct is a 1.5 billion parameter Qwen2-based instruction-tuned language model developed by e-palmisano. It was fine-tuned from unsloth/Qwen2-1.5B-Instruct-bnb-4bit with a focus on enhancing its proficiency in the Italian language. The training involved continuous pretraining on 100k rows of the gsarti/clean_mc4_it dataset, followed by instruction-tuning on the FreedomIntelligence/alpaca-gpt4-italian dataset. This model benefits from being trained 2x faster using Unsloth and Huggingface's TRL library.

Key Capabilities

  • Italian Language Proficiency: Specifically fine-tuned to improve understanding and generation in Italian.
  • Instruction Following: Designed to respond effectively to instructions in Italian, leveraging the Alpaca-GPT4 Italian dataset.
  • Efficient Training: Utilizes Unsloth for accelerated training, making it a resource-efficient option.

Performance

Evaluation on Italian language benchmarks shows the model's performance:

  • hellaswag_it acc_norm: 48.05
  • arc_it acc_norm: 32.68
  • m_mmlu_it 5-shot acc: 46.89
  • Average Accuracy Normalized: 42.57

For a comprehensive comparison, users can refer to the Leaderboard for Italian Language Models.

Good for

  • Applications requiring strong Italian language understanding and generation.
  • Instruction-based tasks and chatbots in Italian.
  • Developers looking for a specialized, efficiently trained Italian LLM.