anakin87/Phi-3.5-mini-ITA: An Italian-Optimized LLM

This model is a fine-tuned version of Microsoft's Phi-3.5-mini-instruct, adapted to improve performance in Italian. With 3.82 billion parameters and a 128k-token context window, it offers a powerful yet compact solution for Italian natural language processing tasks.

Key Capabilities & Features

  • Italian Language Proficiency: Optimized for Italian, demonstrating strong performance on Italian-specific benchmarks.
  • Compact Size: A 3.82 billion parameter model, making it efficient for deployment and use on more constrained hardware, including local environments like Colab.
  • Extended Context Window: Supports a 128k context length, allowing for processing longer texts and maintaining conversational coherence over extended interactions.
  • Benchmark Performance: Achieves an average score of 57.67 on the Open ITA LLM Leaderboard and 57.95 on the Pinocchio ITA Leaderboard, surpassing the larger Meta-Llama-3.1-8B-Instruct in Italian evaluations.
  • Efficient Training: Utilizes the Spectrum technique for parameter-efficient learning, focusing training on high Signal-to-Noise Ratio layers.
  • Inference Acceleration: Compatible with Flash Attention 2 for faster inference (see the loading sketch after this list).
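
A minimal loading sketch with 🤗 Transformers follows. It assumes `torch` and `transformers` are installed and a CUDA GPU is available; the `flash-attn` package is required for `attn_implementation="flash_attention_2"` (drop that argument to use the default attention). The Italian prompt is purely illustrative.

```python
# Hedged sketch: load anakin87/Phi-3.5-mini-ITA with Transformers.
# Assumes torch + transformers and a CUDA GPU; flash-attn must be
# installed for attn_implementation="flash_attention_2".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "anakin87/Phi-3.5-mini-ITA"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,               # BF16 weights
    device_map="auto",
    attn_implementation="flash_attention_2",  # optional inference speed-up
)

# Chat-style prompt in Italian (illustrative).
messages = [{"role": "user", "content": "Spiegami brevemente la fotosintesi."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```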

Good For

  • Italian Language Applications: Ideal for chatbots, content generation, summarization, and other NLP tasks requiring high accuracy in Italian.
  • Resource-Constrained Environments: Its small size allows for smooth operation on consumer-grade GPUs and platforms like Google Colab.
  • Developers Building AI Applications: Can be integrated with frameworks like Haystack for building RAG systems, summarization tools, and multilingual applications, as sketched below.
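
A hedged Haystack sketch, assuming the `haystack-ai` package (Haystack 2.x); the `HuggingFaceLocalGenerator` component and its parameters reflect that version and may differ in others.

```python
# Hedged sketch: use the model as a local generator in Haystack 2.x.
# Assumes the haystack-ai package; component names are Haystack 2.x-specific.
from haystack.components.generators import HuggingFaceLocalGenerator

generator = HuggingFaceLocalGenerator(
    model="anakin87/Phi-3.5-mini-ITA",
    task="text-generation",
    generation_kwargs={"max_new_tokens": 256},
)
generator.warm_up()  # loads the model weights

# Illustrative summarization-style prompt in Italian.
result = generator.run("Riassumi in una frase: la fotosintesi converte la luce in energia chimica.")
print(result["replies"][0])
```

In a RAG setup, this generator would typically sit at the end of a pipeline, after a retriever and a prompt builder supply the grounding context.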