beomi/Llama-3-KoEn-8B-Instruct-preview

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:May 1, 2024License:llama3Architecture:Transformer0.0K Warm

beomi/Llama-3-KoEn-8B-Instruct-preview is an 8 billion parameter instruction-tuned language model developed by beomi, based on Meta's Llama-3-8B. This model is continued pre-trained and specifically designed for Korean and English language tasks, leveraging the Chat Vector paper's methodology. It serves as a strong starting point for creating new Korean chat and instruction models, despite not being fine-tuned with a dedicated Korean instruction set in this preview version.

Loading preview...

Llama-3-KoEn-8B-Instruct-preview Overview

This model, developed by beomi, is an instruction-tuned variant of the Llama-3-8B architecture, featuring 8 billion parameters. It has undergone continued pre-training, specifically focusing on Korean and English language capabilities. The development utilized TPUv4-256 with support from Google's TRC program.

Key Characteristics

  • Base Model: Built upon Meta's Llama-3-8B.
  • Multilingual Focus: Designed for both Korean and English language processing.
  • Instruction Tuning: Incorporates ideas from the Chat Vector paper for instruction following.
  • Preview Status: While instruction-tuned, this preview version has not been fine-tuned with a dedicated Korean instruction set.
  • Context Length: Supports an 8192-token context window.

Primary Use Case

This model is intended as an excellent foundational model for developers looking to create new Korean chat and instruction-following models. Its pre-training on Korean and English data makes it suitable for applications requiring bilingual understanding and generation, particularly in conversational AI and instruction-based tasks.

Popular Sampler Settings

Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.

temperature
top_p
top_k
frequency_penalty
presence_penalty
repetition_penalty
min_p