Llama-3-KoEn-8B-Instruct-preview Overview
This model, developed by beomi, is an instruction-tuned variant of the 8-billion-parameter Llama-3-8B architecture. It underwent continued pre-training focused on Korean and English language capabilities, and was trained on TPUv4-256 hardware with support from Google's TRC (TPU Research Cloud) program.
Key Characteristics
- Base Model: Built upon Meta's Llama-3-8B.
- Multilingual Focus: Designed for both Korean and English language processing.
- Instruction Tuning: Incorporates ideas from the Chat Vector paper for instruction following.
- Preview Status: Although instruction-tuned, this preview version has not yet been fine-tuned on a dedicated Korean instruction dataset.
- Context Length: Supports an 8192-token context window.
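Because the model inherits Meta's Llama-3 chat format, prompts follow the standard Llama-3 header/turn layout. The sketch below spells that layout out explicitly for illustration; in practice you would let the tokenizer's `apply_chat_template` produce it. The function name and example strings are illustrative, not from the model card.

```python
def format_llama3_prompt(system: str, user: str) -> str:
    """Build a single-turn Llama-3 instruct prompt by hand.

    The special tokens (<|begin_of_text|>, <|start_header_id|>, <|eot_id|>)
    are Llama-3's own; each completed turn ends with <|eot_id|>, and the
    trailing assistant header asks the model to generate its reply.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = format_llama3_prompt("You are a helpful assistant.", "안녕하세요!")
print(prompt.count("<|eot_id|>"))  # → 2 (one per completed turn)
```

Note that the prompt ends with the assistant header and no `<|eot_id|>`, which is what cues the model to continue with the assistant's turn.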
Primary Use Case
This model is intended as a foundation for developers building Korean chat and instruction-following models. Its continued pre-training on Korean and English data makes it suitable for applications requiring bilingual understanding and generation, particularly conversational AI and instruction-based tasks.
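A minimal inference sketch with Hugging Face `transformers` might look like the following. The repository id matches the model name above, but the helper functions, system prompt, and generation settings are assumptions for illustration; running `generate` requires the `transformers` and `torch` packages plus a download of the weights, so the heavy imports are kept inside the function.

```python
# Hypothetical inference sketch; repo id taken from the model name.
MODEL_ID = "beomi/Llama-3-KoEn-8B-Instruct-preview"

def build_messages(system_prompt: str, user_prompt: str) -> list[dict]:
    """Chat messages in the role/content format used by HF chat templates."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ]

def generate(user_prompt: str, max_new_tokens: int = 256) -> str:
    # Lazy imports: needs `transformers`, `torch`, and the model weights.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    messages = build_messages(
        "You are a helpful assistant fluent in Korean and English.", user_prompt
    )
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(
        output[0][input_ids.shape[-1]:], skip_special_tokens=True
    )
```

Since this preview has not been fine-tuned on a Korean instruction dataset, outputs from a sketch like this should be treated as a baseline to build on rather than a finished chat experience.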