Overview
SeaPhi3-mini: An Italian-Optimized Phi-3 Variant
SeaPhi3-mini is a 3.8 billion parameter language model developed by Toti Riccardo, built on top of Microsoft's Phi-3-mini-128k-instruct. It has been fine-tuned on the SeacomSrl/rag-data dataset, a collection of data translated into Italian, to strengthen its Italian language understanding and generation.
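For orientation, here is a minimal sketch of how such a fine-tune is typically loaded and queried with Hugging Face transformers. The repo id SeacomSrl/SeaPhi3-mini is an assumption (substitute the model's actual Hub path), and the generation settings are illustrative defaults rather than values published for this model.

```python
# Minimal sketch: loading the model and generating Italian text with transformers.
# The repo id below is an assumption; replace it with the actual Hub path.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "SeacomSrl/SeaPhi3-mini"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 3.8B model in bf16 fits on a single mid-range GPU
    device_map="auto",
)

# Phi-3-style models expect a chat-formatted prompt; apply_chat_template builds it.
messages = [
    # "Summarize the history of ancient Rome in two sentences."
    {"role": "user", "content": "Riassumi in due frasi la storia di Roma antica."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```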
Key Capabilities
- Italian Language Proficiency: Optimized for processing and generating Italian text, making it well suited to Italian-specific applications.
- Compact Size: At roughly 3.8 billion parameters, it balances output quality against computational cost.
- Context Length: Supports a context window of 4096 tokens, allowing it to process moderately long inputs (see the sketch after this list).
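As a practical illustration of the context limit, the sketch below checks whether a prompt fits inside a 4096-token window before generation. The repo id is the same assumption as above, and the 512-token output budget is arbitrary.

```python
# Sketch: checking that a prompt fits the stated 4096-token context window
# before sending it to the model. The repo id is an assumption.
from transformers import AutoTokenizer

MAX_CONTEXT = 4096          # context length stated for this model
RESERVED_FOR_OUTPUT = 512   # illustrative budget for generated tokens

tokenizer = AutoTokenizer.from_pretrained("SeacomSrl/SeaPhi3-mini")  # hypothetical id

def fits_in_context(prompt: str) -> bool:
    """Return True if the prompt leaves room for RESERVED_FOR_OUTPUT new tokens."""
    n_tokens = len(tokenizer.encode(prompt))
    return n_tokens + RESERVED_FOR_OUTPUT <= MAX_CONTEXT

# A stand-in for a long Italian document; truncate or chunk it if this prints False.
long_document = "Il Rinascimento italiano fu un periodo di grande fermento culturale. " * 500
print(fits_in_context(long_document))
```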
Performance Metrics (Italian Benchmarks)
Initial evaluations on Italian benchmarks report the following scores (a reproduction sketch follows the list):
- hellaswag_it: accuracy 0.4502, normalized accuracy 0.5994.
- arc_it: accuracy 0.0813, normalized accuracy 0.4243.
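These task names match the Italian tasks shipped with EleutherAI's lm-evaluation-harness, so an evaluation along the following lines should produce comparable numbers. The repo id, dtype, and batch size are assumptions; the exact task names depend on the harness version actually used.

```python
# Sketch: evaluating the model on Italian benchmarks with lm-evaluation-harness.
# Task names and the repo id are assumptions; verify them against your harness version.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=SeacomSrl/SeaPhi3-mini,dtype=bfloat16",  # hypothetical id
    tasks=["hellaswag_it", "arc_it"],
    batch_size=8,
)

# Each task reports acc and acc_norm, matching the metrics listed above.
for task, metrics in results["results"].items():
    print(task, metrics)
```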
Good For
- Applications requiring a small, efficient model for Italian text processing.
- Research and development in Italian natural language understanding.
- Use cases where a model fine-tuned on Italian data offers an advantage over general-purpose models.