anakin87/gemma-2-9b-neogenesis-ita

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:9BQuant:FP8Ctx Length:16kPublished:Jan 3, 2025License:gemmaArchitecture:Transformer0.0K Warm

anakin87/gemma-2-9b-neogenesis-ita is a 9.24 billion parameter Gemma 2 model, fine-tuned for enhanced performance in Italian. This model leverages Direct Preference Optimization and the Spectrum technique, focusing training on the top 20% most informative layers. It excels in Italian language tasks, outperforming many larger models on the Open Ita LLM Leaderboard, and supports an 8K context length.

Loading preview...

Gemma 2 9B Neogenesis ITA Overview

This model, anakin87/gemma-2-9b-neogenesis-ita, is a fine-tuned version of the Gemma 2 9B architecture, specifically optimized for the Italian language. It builds upon VAGOsolutions/SauerkrautLM-gemma-2-9b-it to deliver superior performance in Italian contexts.

Key Capabilities & Features

  • Enhanced Italian Performance: Achieves strong results on the Open Ita LLM Leaderboard, surpassing the base Gemma 2 9B model and even some 13-14B and 30-70B models in Italian benchmarks like MMLU_IT, ARC_IT, and HELLASWAG_IT.
  • Efficient Fine-tuning: Utilizes Direct Preference Optimization (DPO) with the Spectrum technique, which focuses training on the top 20% most informative layers, freezing the rest for parameter-efficient learning.
  • Training Data: Primarily trained on Italian datasets, including mii-llm/argilla-math-preferences-it, ruggsea/wsdm2024-cot-dataset, and anakin87/evol-dpo-ita-reranked, with a small portion of English data.
  • Context Length: Supports an 8K context length.

Ideal Use Cases

  • Italian Language Generation: Excellent for tasks requiring high-quality text generation, summarization, or conversational AI in Italian.
  • Research and Development: Suitable for developers and researchers focusing on Italian NLP applications, offering a strong baseline for further fine-tuning or integration.
  • Educational Content: Can be used for creating explanations or educational materials in Italian, as demonstrated by its ability to explain complex topics like compound interest.