anakin87/gemma-2-9b-neogenesis-ita
TEXT GENERATIONConcurrency Cost:1Model Size:9BQuant:FP8Ctx Length:16kPublished:Jan 3, 2025License:gemmaArchitecture:Transformer0.0K Warm
anakin87/gemma-2-9b-neogenesis-ita is a 9.24 billion parameter Gemma 2 model, fine-tuned for enhanced performance in Italian. This model leverages Direct Preference Optimization and the Spectrum technique, focusing training on the top 20% most informative layers. It excels in Italian language tasks, outperforming many larger models on the Open Ita LLM Leaderboard, and supports an 8K context length.
Loading preview...
Gemma 2 9B Neogenesis ITA Overview
This model, anakin87/gemma-2-9b-neogenesis-ita, is a fine-tuned version of the Gemma 2 9B architecture, specifically optimized for the Italian language. It builds upon VAGOsolutions/SauerkrautLM-gemma-2-9b-it to deliver superior performance in Italian contexts.
Key Capabilities & Features
- Enhanced Italian Performance: Achieves strong results on the Open Ita LLM Leaderboard, surpassing the base Gemma 2 9B model and even some 13-14B and 30-70B models in Italian benchmarks like MMLU_IT, ARC_IT, and HELLASWAG_IT.
- Efficient Fine-tuning: Utilizes Direct Preference Optimization (DPO) with the Spectrum technique, which focuses training on the top 20% most informative layers, freezing the rest for parameter-efficient learning.
- Training Data: Primarily trained on Italian datasets, including
mii-llm/argilla-math-preferences-it,ruggsea/wsdm2024-cot-dataset, andanakin87/evol-dpo-ita-reranked, with a small portion of English data. - Context Length: Supports an 8K context length.
Ideal Use Cases
- Italian Language Generation: Excellent for tasks requiring high-quality text generation, summarization, or conversational AI in Italian.
- Research and Development: Suitable for developers and researchers focusing on Italian NLP applications, offering a strong baseline for further fine-tuning or integration.
- Educational Content: Can be used for creating explanations or educational materials in Italian, as demonstrated by its ability to explain complex topics like compound interest.