mii-community/zefiro-7b-dpo-ITA
Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 8k · Published: Feb 20, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights
Zefiro-7b-dpo-ITA is a 7 billion parameter DPO fine-tuned causal language model developed by giux78, specifically optimized for the Italian language. Based on Zefiro-7b-sft-ITA and inspired by the Zephyr model, it excels in conversational tasks in Italian. This model offers strong performance in Italian language understanding and generation, making it suitable for various Italian NLP applications.
Zefiro-7b-dpo-ITA: A DPO Fine-Tuned Italian LLM
Zefiro-7b-dpo-ITA is a 7 billion parameter GPT-like model developed by giux78, specifically fine-tuned for the Italian language using Direct Preference Optimization (DPO). It builds upon the Zefiro-7b-sft-ITA model and draws inspiration from the Zephyr and LLaMAntino models.
Key Capabilities & Training:
- Italian Language Specialization: Primarily focused on Italian, making it highly effective for Italian NLP tasks.
- DPO Fine-Tuning: Utilizes DPO on a filtered version of the ultrafeedback-preferences-ITA dataset, enhancing its conversational abilities.
- Performance: Achieves competitive results on Italian benchmarks, averaging 56.86 across ARC-c, HellaSwag, and MMLU, outperforming both the base and SFT versions.
- Training Data: Trained using a translated version of the UltraChat dataset, with careful consideration for translation quality.
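DPO skips reward modeling and optimizes the policy directly on preference pairs: for each prompt it pushes up the log-probability margin of the chosen completion over the rejected one, relative to a frozen reference model. A minimal sketch of the per-pair loss on sequence log-probabilities (the function name and scalar formulation are illustrative, not the model's actual training code):

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss from sequence log-probabilities (illustrative sketch)."""
    # Log-ratios of policy vs. frozen reference for each completion
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    # DPO objective: -log sigmoid(beta * (chosen - rejected) margin)
    margin = beta * (chosen_logratio - rejected_logratio)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

When the policy already prefers the chosen completion more strongly than the reference does, the margin is positive and the loss falls below log 2; beta controls how hard the policy is pulled away from the reference.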
Intended Uses:
- Conversational AI: Ideal as a base model for developing more specific conversational agents in Italian.
- Italian NLP Applications: Suitable for various tasks requiring strong Italian language understanding and generation.
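For conversational use, prompts need to follow the model's chat format. A minimal sketch, assuming Zefiro-7b-dpo-ITA inherits the Zephyr-style `<|system|>`/`<|user|>`/`<|assistant|>` template (in practice, the tokenizer's own `apply_chat_template` is the authoritative source of the format):

```python
# Sketch of building a Zephyr-style chat prompt for Italian conversation.
# Assumption: the model uses Zephyr's turn markers; verify against the
# tokenizer's chat template before relying on this exact layout.

def build_chat_prompt(messages):
    """Render a list of {role, content} turns into a Zephyr-style prompt."""
    parts = [f"<|{m['role']}|>\n{m['content']}</s>\n" for m in messages]
    parts.append("<|assistant|>\n")  # generation continues from here
    return "".join(parts)

messages = [
    {"role": "system", "content": "Sei un assistente che risponde in italiano."},
    {"role": "user", "content": "Spiega in una frase cos'è il fine-tuning DPO."},
]
prompt = build_chat_prompt(messages)

# Actual inference (downloads several GB of weights; shown for context only):
# from transformers import pipeline
# pipe = pipeline("text-generation", model="mii-community/zefiro-7b-dpo-ITA",
#                 device_map="auto")
# print(pipe(prompt, max_new_tokens=256)[0]["generated_text"])
```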
Limitations:
- Beyond the DPO phase, the model has not undergone human-preference alignment for safety, so it can produce problematic outputs when prompted to do so.
- The exact composition of the base model's training corpus is unknown, but likely includes web data and technical sources.