beomi/Llama-3-KoEn-8B-Instruct-preview

  • Visibility: Public
  • Parameters: 8B
  • Precision: FP8
  • Context length: 8192 tokens
  • Released: May 1, 2024
  • License: cc-by-nc-sa-4.0

Llama-3-KoEn-8B-Instruct-preview Overview

This model, developed by beomi, is an instruction-tuned variant of the Llama-3-8B architecture with 8 billion parameters. It was continually pre-trained with a focus on Korean and English language capability, with training performed on a TPUv4-256 pod provided through Google's TRC (TPU Research Cloud) program.

Key Characteristics

  • Base Model: Built upon Meta's Llama-3-8B.
  • Multilingual Focus: Designed for both Korean and English language processing.
  • Instruction Tuning: Incorporates ideas from the Chat Vector paper for instruction following.
  • Preview Status: While instruction-tuned, this preview version has not been fine-tuned with a dedicated Korean instruction set.
  • Context Length: Supports an 8192-token context window.
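The Chat Vector approach referenced above can be sketched as simple weight arithmetic: subtract the base model's weights from the instruction-tuned model's weights to obtain a "chat vector", then add that vector to the continued-pretrained KoEn model. A minimal sketch with a hypothetical helper name, using toy integers in place of real weight tensors:

```python
def apply_chat_vector(koen_base, llama_base, llama_instruct):
    """Merge instruction-following ability into a continued-pretrained model.

    chat_vector = llama_instruct - llama_base (the delta that encodes
    instruction tuning); the merged model is koen_base + chat_vector.
    Real weights are per-parameter tensors; ints stand in for them here.
    """
    return {name: koen_base[name] + (llama_instruct[name] - llama_base[name])
            for name in koen_base}

# Toy per-parameter values standing in for full weight tensors.
llama_base = {"w": 1}
llama_instruct = {"w": 3}   # base + an instruction-tuning delta of 2
koen_base = {"w": 2}        # base + Korean/English continued pre-training

merged = apply_chat_vector(koen_base, llama_base, llama_instruct)
# merged["w"] == 4: continued pre-training plus the instruction delta
```

In practice the same elementwise operation is applied to every parameter tensor of the three checkpoints, which is what lets the preview inherit instruction-following behavior without a dedicated Korean instruction-tuning run.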

Primary Use Case

This model is intended as a foundation for developers building Korean chat and instruction-following models. Its continued pre-training on Korean and English data suits it to applications requiring bilingual understanding and generation, particularly conversational AI and instruction-based tasks.
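For conversational use, prompting follows Meta's Llama-3 chat template. In practice `tokenizer.apply_chat_template` from the transformers library assembles this automatically; the sketch below builds the prompt by hand to show the structure, assuming Meta's published Llama-3 special tokens:

```python
def build_llama3_prompt(system, user):
    """Assemble a Llama-3-style chat prompt.

    Uses the special tokens from Meta's Llama-3 chat format; normally
    tokenizer.apply_chat_template produces this string for you.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n" + system + "<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n" + user + "<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You are a helpful bilingual (Korean/English) assistant.",
    "안녕하세요?",  # "Hello?" in Korean
)
```

Generation then continues from the trailing assistant header, stopping at the `<|eot_id|>` token.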