SeacomSrl/SeaPhi3-mini

Parameters: 4B · Tensor type: BF16 · Context length: 4096
License: apache-2.0
Overview

SeaPhi3-mini: An Italian-Optimized Phi-3 Variant

SeaPhi3-mini is a 4-billion-parameter language model developed by Toti Riccardo, built on Microsoft's Phi-3-mini-128k-instruct. It was fine-tuned on the SeacomSrl/rag-data dataset, a collection of data translated into Italian, to improve its performance on Italian language understanding and generation tasks.

Key Capabilities

  • Italian Language Proficiency: Optimized for processing and generating text in Italian, making it suitable for Italian-specific applications.
  • Compact Size: With 4 billion parameters, it offers a balance between performance and computational efficiency.
  • Context Length: Supports a 4096-token context window, enough for moderately long inputs.
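As a base-model derivative, the model expects prompts in a Phi-3-style chat format. The sketch below builds such a prompt by hand, assuming SeaPhi3-mini inherits the chat template of Phi-3-mini-128k-instruct (the role names and `<|end|>` delimiter are that template's convention, not stated in this card); in practice, `tokenizer.apply_chat_template` from `transformers` should be preferred, as it reads the template shipped with the model.

```python
# Sketch (assumption): Phi-3-style chat prompt construction.
# SeaPhi3-mini is assumed to reuse its base model's template;
# prefer tokenizer.apply_chat_template in real code.

def build_phi3_prompt(messages):
    """Render a list of {'role', 'content'} dicts into a Phi-3 chat prompt."""
    parts = []
    for m in messages:
        parts.append(f"<|{m['role']}|>\n{m['content']}<|end|>\n")
    parts.append("<|assistant|>\n")  # generation continues after this tag
    return "".join(parts)

prompt = build_phi3_prompt([
    {"role": "user", "content": "Qual è la capitale d'Italia?"}
])
print(prompt)
```

The resulting string can then be tokenized and passed to the model's `generate` method like any other causal-LM input.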

Performance Metrics (Italian Benchmarks)

Initial evaluations on Italian benchmarks report the following scores:

  • Hellaswag_it: Achieves an accuracy of 0.4502 and a normalized accuracy of 0.5994.
  • Arc_it: Shows an accuracy of 0.0813 and a normalized accuracy of 0.4243.
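The distinction between accuracy and normalized accuracy follows the convention of multiple-choice evaluation harnesses such as lm-evaluation-harness: "acc" picks the answer option whose completion has the highest total log-likelihood, while "acc_norm" first divides each log-likelihood by the completion's length so that longer options are not penalized. The toy sketch below (length measured in characters here for simplicity; harnesses typically use bytes) shows how the two metrics can pick different answers:

```python
# Illustration (assumption): how "acc" vs "acc_norm" select an answer
# on a multiple-choice item such as those in Hellaswag_it / Arc_it.

def pick(loglikelihoods, completions, normalize=False):
    """Return the index of the chosen completion.

    normalize=False -> raw log-likelihood argmax ("acc").
    normalize=True  -> length-normalized argmax ("acc_norm").
    """
    scores = [
        ll / len(c) if normalize else ll
        for ll, c in zip(loglikelihoods, completions)
    ]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy item: a long-but-plausible completion vs a short one.
completions = ["una risposta lunga e dettagliata", "sì"]
lls = [-12.0, -3.0]  # total log-likelihood of each completion

print(pick(lls, completions))                  # → 1 (raw: short answer wins)
print(pick(lls, completions, normalize=True))  # → 0 (normalized: long answer wins)
```

This length effect explains why the two columns can diverge sharply, as in the Arc_it scores above.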

Good For

  • Applications requiring a small, efficient model for Italian text processing.
  • Research and development in Italian natural language understanding.
  • Use cases where a model fine-tuned on Italian data is preferable to general-purpose models.