google/gemma-2-2b-jpn-it

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 2.6B · Quant: BF16 · Ctx Length: 8k · Published: Sep 25, 2024 · License: gemma · Architecture: Transformer

Google's Gemma-2-2B-JPN-IT is a 2.6 billion parameter instruction-tuned, decoder-only large language model from the Gemma 2 family, specifically fine-tuned for Japanese text. It supports a context length of 8192 tokens and is designed to handle Japanese queries at the same level of performance as English-only queries on Gemma 2. This model excels in various Japanese text generation tasks, including question answering, summarization, and reasoning, and can also be used for translation.


Model Overview

Google's Gemma-2-2B-JPN-IT is a 2.6 billion parameter instruction-tuned model, part of the Gemma 2 series, which draws inspiration from the Gemini family of models. It is a text-to-text, decoder-only large language model with open weights, specifically fine-tuned on Japanese text. A key differentiator is its optimized performance for the Japanese language, aiming to match the capabilities of English-only Gemma 2 models for Japanese queries.

Key Capabilities

  • Japanese Language Proficiency: Fine-tuned to support Japanese with performance comparable to English-only Gemma 2 models.
  • Text Generation: Capable of various text generation tasks, including question answering, summarization, and reasoning in Japanese.
  • Translation: Demonstrated ability to translate between Japanese and English.
  • Robust Training: Trained on a diverse dataset totaling 8 trillion tokens, including web documents, code, mathematics, and large-scale Japanese and multilingual instruction data.
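Like other Gemma 2 instruction-tuned checkpoints, this model expects prompts in Gemma's turn-marker format (`<start_of_turn>` / `<end_of_turn>`). A minimal sketch of building a single-turn Japanese prompt; the helper name is illustrative, and in practice a tokenizer's chat template would produce the same structure:

```python
def build_gemma_prompt(user_message: str) -> str:
    """Format a single-turn prompt using Gemma's turn markers.

    The marker strings follow Gemma's documented chat format; the
    trailing model turn is left open so generation continues from it.
    """
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

# Example: a Japanese question-answering prompt
prompt = build_gemma_prompt("日本の首都はどこですか？")
print(prompt)
```

The formatted string is what the model actually sees; serving stacks that accept chat messages typically apply this template on the caller's behalf.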

Evaluation and Safety

The model's quality was assessed using an LLM-as-a-judge approach against GPT-3.5 for Japanese prompts, yielding a preference score of 0.03 ± 0.04, i.e. roughly on par with the baseline. It also achieved 98.24% language correctness for Japanese prompts. Rigorous safety measures, including CSAM and sensitive data filtering, were applied during data preprocessing. Ethical considerations such as bias, misinformation, and privacy were addressed through evaluation and mitigation strategies.

Intended Usage

This model is well-suited for content creation (poems, scripts, marketing copy), chatbots, text summarization, NLP research, language learning tools, and knowledge exploration, particularly for applications requiring strong Japanese language capabilities.

Popular Sampler Settings

The most popular parameter combinations used by Featherless users for this model tune the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
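As a sketch of how these sampler parameters might be assembled and sanity-checked before a generation call, here is a minimal example. The numeric values are placeholders for illustration, not the actual configurations used by Featherless users, and the validation helper is hypothetical:

```python
# Illustrative sampler settings; these values are placeholders,
# not the real "top 3" Featherless configurations.
sampler_settings = {
    "temperature": 0.7,        # randomness of token sampling
    "top_p": 0.9,              # nucleus sampling: keep tokens within this cumulative probability
    "top_k": 40,               # restrict sampling to the k most likely tokens
    "frequency_penalty": 0.0,  # penalize tokens by how often they have appeared
    "presence_penalty": 0.0,   # penalize tokens that have appeared at all
    "repetition_penalty": 1.1, # multiplicative penalty on repeated tokens
    "min_p": 0.05,             # drop tokens below this fraction of the top token's probability
}

def validate_sampler_settings(settings: dict) -> dict:
    """Clamp probability-like values into [0, 1] and keep temperature non-negative."""
    out = dict(settings)
    out["temperature"] = max(0.0, out.get("temperature", 1.0))
    out["top_p"] = min(1.0, max(0.0, out.get("top_p", 1.0)))
    out["min_p"] = min(1.0, max(0.0, out.get("min_p", 0.0)))
    return out
```

A dictionary like this can then be passed as keyword arguments to a generation API that accepts these parameter names.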