alfredplpl/Llama-3-8B-Instruct-Ja

Available on Hugging Face

Text generation · Concurrency cost: 1 · Model size: 8B · Quantization: FP8 · Context length: 8K · Published: Apr 22, 2024 · License: llama3 · Architecture: Transformer

alfredplpl/Llama-3-8B-Instruct-Ja is an 8 billion parameter instruction-tuned causal language model based on Meta's Llama 3 architecture and optimized for Japanese. It extends the original Llama 3 with further instruction tuning on Japanese datasets, and is designed for Japanese natural language understanding and generation tasks that require strong Japanese linguistic performance.


Overview

alfredplpl/Llama-3-8B-Instruct-Ja is an 8 billion parameter instruction-tuned language model derived from Meta's Llama 3 architecture, specifically adapted for the Japanese language. The model has undergone a two-stage instruction tuning process to significantly improve its Japanese language capabilities.

Key Capabilities

  • Enhanced Japanese Performance: The model was fine-tuned on approximately 2.4 million Japanese question-answering pairs from cl-nagoya/auto-wiki-qa, then further refined with llm-jp/databricks-dolly-15k-ja.
  • Instruction Following: It is instruction-tuned to respond effectively to user prompts, making it suitable for conversational AI and task-oriented applications (see the inference sketch after this list).
  • Commercial Use: The model is released under the Llama 3 license, which permits commercial use.
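A minimal inference sketch using the standard transformers chat-template workflow for Llama 3 Instruct models. The prompts and sampling values are illustrative assumptions, not settings published with this model:

```python
# Minimal inference sketch (assumes transformers >= 4.40 and a CUDA GPU).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "alfredplpl/Llama-3-8B-Instruct-Ja"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Llama 3 Instruct models expect the chat template; apply it rather than
# concatenating raw strings. The prompts here are illustrative.
messages = [
    {"role": "system", "content": "あなたは誠実で優秀な日本語のアシスタントです。"},
    {"role": "user", "content": "大規模言語モデルについて簡単に説明してください。"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Llama 3 Instruct uses <|eot_id|> to end assistant turns, so stop on it too.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>"),
]

output = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```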

Training Details

Training used LoRA-based instruction tuning on two NVIDIA A6000 GPUs, for a total of about 60 GPU hours. A first LoRA adapter was trained on cl-nagoya/auto-wiki-qa for one epoch and merged into the base model; a second adapter was then trained on llm-jp/databricks-dolly-15k-ja for five epochs and merged in the same way, as sketched below.
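A minimal sketch of that two-stage merge using the peft library. The adapter directories are hypothetical placeholders, and the training runs themselves (one epoch on auto-wiki-qa, five on dolly-15k-ja) are assumed to have happened beforehand; only the merge steps described above are shown:

```python
# Sketch of the two-stage LoRA merge workflow (adapter paths are hypothetical).
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Start from the Llama 3 instruct base model.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# Stage 1: merge the adapter trained for 1 epoch on cl-nagoya/auto-wiki-qa.
stage1 = PeftModel.from_pretrained(base, "./lora-auto-wiki-qa")
merged = stage1.merge_and_unload()

# Stage 2: merge the adapter trained for 5 epochs on
# llm-jp/databricks-dolly-15k-ja, starting from the stage-1 merged weights.
stage2 = PeftModel.from_pretrained(merged, "./lora-dolly-15k-ja")
final = stage2.merge_and_unload()

final.save_pretrained("./Llama-3-8B-Instruct-Ja")
```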

Good For

  • Japanese AI Assistants: Ideal for building AI assistants that require strong Japanese language understanding and generation.
  • Japanese Content Creation: Suitable for generating various forms of Japanese text, from creative writing to informative responses.
  • Research and Development: Provides a solid base for further research and development in Japanese large language models.

Popular Sampler Settings

The three parameter combinations most used by Featherless users for this model cover the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
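As an illustration of where these parameters plug in, the sketch below sends a request through an OpenAI-compatible client. The base URL, API key placeholder, and all parameter values are assumptions for illustration, not the tracked user configurations; top_k, repetition_penalty, and min_p are not part of the standard OpenAI schema, so they are passed via extra_body, which compatible servers may or may not accept:

```python
# Illustrative sampler configuration via an OpenAI-compatible client.
# Base URL and all parameter values are assumptions, not measured configs.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed endpoint; check the docs
    api_key="YOUR_API_KEY",
)

response = client.chat.completions.create(
    model="alfredplpl/Llama-3-8B-Instruct-Ja",
    messages=[{"role": "user", "content": "自己紹介をしてください。"}],
    temperature=0.7,
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    max_tokens=256,
    # Non-standard sampler knobs; forwarded only if the server supports them.
    extra_body={"top_k": 40, "repetition_penalty": 1.05, "min_p": 0.05},
)
print(response.choices[0].message.content)
```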