SeacomSrl/SeaPhi3-mini
SeaPhi3-mini: An Italian-Optimized Phi-3 Variant
SeaPhi3-mini is a 4-billion-parameter language model developed by Toti Riccardo, fine-tuned from Microsoft's Phi-3-mini-128k-instruct. It was trained on the SeacomSrl/rag-data dataset, a collection of Italian-translated data, to improve its Italian language understanding and generation.
Key Capabilities
- Italian Language Proficiency: Optimized for processing and generating text in Italian, making it suitable for Italian-specific applications.
- Compact Size: With 4 billion parameters, it offers a balance between performance and computational efficiency.
- Context Length: Supports a context window of 4096 tokens, allowing it to process moderately long inputs.
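A minimal usage sketch with Hugging Face `transformers` (not from the model card itself): the manual `<|user|>`/`<|end|>` markers assume SeaPhi3-mini keeps the base Phi-3 chat template, so checking the tokenizer's `apply_chat_template` output is the safer route.

```python
def build_prompt(user_message: str) -> str:
    """Build a Phi-3-style chat prompt.

    Assumes SeaPhi3-mini keeps the base model's <|user|>/<|assistant|>
    chat markers; confirm against the tokenizer's chat template.
    """
    return f"<|user|>\n{user_message}<|end|>\n<|assistant|>\n"


def generate(user_message: str, max_new_tokens: int = 128) -> str:
    """Generate an Italian completion with SeaPhi3-mini via transformers."""
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

    model_id = "SeacomSrl/SeaPhi3-mini"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
    pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
    out = pipe(
        build_prompt(user_message),
        max_new_tokens=max_new_tokens,
        return_full_text=False,  # return only the newly generated tokens
    )
    return out[0]["generated_text"]
```

For example, `generate("Riassumi in una frase la storia di Roma.")` would return an Italian completion; note the full 4096-token context applies to prompt plus generated tokens combined.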
Performance Metrics (Italian Benchmarks)
Initial evaluations on Italian benchmarks report the following scores:
- Hellaswag_it: accuracy 0.4502, normalized accuracy 0.5994.
- Arc_it: accuracy 0.0813, normalized accuracy 0.4243.
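These metric names match EleutherAI's lm-evaluation-harness, which provides `hellaswag_it` and `arc_it` tasks; assuming that is the harness used (not stated in the card), a rough sketch of a reproduction run:

```shell
# Sketch only: assumes lm-evaluation-harness was used for these scores.
pip install lm-eval

lm_eval --model hf \
  --model_args pretrained=SeacomSrl/SeaPhi3-mini \
  --tasks hellaswag_it,arc_it \
  --batch_size 8
```

The `acc` and `acc_norm` columns of the harness output correspond to the accuracy and normalized accuracy figures above.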
Good For
- Applications requiring a small, efficient model for Italian text processing.
- Research and development in Italian natural language understanding.
- Use cases where a model fine-tuned on Italian data is preferable to a general-purpose model.