e-palmisano/Qwen2-1.5B-ITA-Instruct
e-palmisano/Qwen2-1.5B-ITA-Instruct is a 1.5-billion-parameter Qwen2-based causal language model fine-tuned by e-palmisano to improve Italian language capabilities, using the gsarti/clean_mc4_it and FreedomIntelligence/alpaca-gpt4-italian datasets. It targets Italian language understanding and instruction-following tasks, was trained with Unsloth for faster fine-tuning, and supports a context length of 131,072 tokens.
Overview
e-palmisano/Qwen2-1.5B-ITA-Instruct is a 1.5-billion-parameter Qwen2-based instruction-tuned language model developed by e-palmisano. It was fine-tuned from unsloth/Qwen2-1.5B-Instruct-bnb-4bit with a focus on improving its proficiency in Italian. Training proceeded in two stages: continued pretraining on 100k rows of the gsarti/clean_mc4_it dataset, followed by instruction tuning on the FreedomIntelligence/alpaca-gpt4-italian dataset. According to the author, training ran about 2x faster thanks to Unsloth and Hugging Face's TRL library.
Key Capabilities
- Italian Language Proficiency: Specifically fine-tuned to improve understanding and generation in Italian.
- Instruction Following: Designed to respond effectively to instructions in Italian, leveraging the Alpaca-GPT4 Italian dataset.
- Efficient Training: Utilizes Unsloth for accelerated training, making it a resource-efficient option.
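Since the card does not include a usage snippet, here is a minimal inference sketch with the Hugging Face transformers library. The Italian prompt and generation settings are illustrative, and the `build_chatml_prompt` helper only spells out the ChatML turn format that Qwen2-style chat models use; in practice `tokenizer.apply_chat_template` should be preferred.

```python
def build_chatml_prompt(messages):
    """Assemble a ChatML-style prompt (the turn format used by Qwen2
    chat models). Shown for clarity; tokenizer.apply_chat_template
    produces the equivalent layout from the model's own template."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Leave an open assistant turn for the model to complete.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "e-palmisano/Qwen2-1.5B-ITA-Instruct"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    # An illustrative Italian instruction.
    messages = [
        {"role": "user",
         "content": "Spiega brevemente cos'è il machine learning."}
    ]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=256)
    # Decode only the newly generated tokens.
    print(tokenizer.decode(outputs[0][inputs.shape[-1]:],
                           skip_special_tokens=True))
```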
Performance
Evaluation on Italian language benchmarks shows the model's performance:
- hellaswag_it (acc_norm): 48.05
- arc_it (acc_norm): 32.68
- m_mmlu_it (5-shot acc): 46.89
- Average normalized accuracy: 42.57
For a comprehensive comparison, users can refer to the Leaderboard for Italian Language Models.
Good for
- Applications requiring strong Italian language understanding and generation.
- Instruction-based tasks and chatbots in Italian.
- Developers looking for a specialized, efficiently trained Italian LLM.