beomi/Llama-3-Open-Ko-8B-Instruct-preview

Hugging Face
Text generation · Concurrency cost: 1 · Model size: 8B · Quant: FP8 · Context length: 8k · Published: Apr 23, 2024 · License: other · Architecture: Transformer

beomi/Llama-3-Open-Ko-8B-Instruct-preview is an 8-billion-parameter instruction-tuned language model developed by beomi, based on the Llama-3-8B architecture. It was continued-pretrained on publicly available resources, including over 60GB of deduplicated Korean texts, amounting to 17.7B+ tokens under the new Llama-3 tokenizer. Although this preview has not been fine-tuned on a dedicated Korean instruction set, it serves as a strong starting point for building new Korean chat and instruction models.

Model Overview

beomi/Llama-3-Open-Ko-8B-Instruct-preview is an 8 billion parameter instruction-tuned language model developed by beomi. It is built upon the Llama-3-8B architecture and has undergone continued pre-training using publicly available datasets, comprising over 60GB of deduplicated texts. The model leverages the new Llama-3 tokenizer, with pre-training conducted on more than 17.7 billion tokens, surpassing the token count used for previous Llama-2-Ko models.

Key Characteristics

  • Base Model: Llama-3-8B architecture.
  • Pre-training Data: Over 60GB of deduplicated public texts.
  • Tokenizer: Utilizes the new Llama-3 tokenizer, with 17.7B+ pre-training tokens.
  • Instruction Tuning: This is a preview instruction model, applying concepts from the Chat Vector paper, but it is explicitly noted as not fine-tuned with a dedicated Korean instruction set.
  • Development Support: Training was conducted on TPUv5e-256 with support from the Google TRC program.
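The Chat Vector approach referenced above amounts to simple parameter arithmetic: subtract the base model's weights from the instruct model's weights to obtain a "chat vector", then add that vector to the continued-pretrained (here, Korean) weights. A minimal sketch with toy scalars standing in for weight tensors (the function name and values are illustrative, not taken from the model card; real use operates over full model state dicts):

```python
def apply_chat_vector(base, instruct, continued_pretrained):
    """Toy illustration of the Chat Vector idea:
    chat_vector = instruct - base, added to the continued-pretrained weights.
    Real implementations do this tensor-by-tensor over model state dicts."""
    return {
        name: continued_pretrained[name] + (instruct[name] - base[name])
        for name in base
    }

# Toy per-parameter scalars standing in for weight tensors (hypothetical values):
base_weights     = {"w": 1.0, "b": 0.5}  # e.g. Llama-3-8B base
instruct_weights = {"w": 1.2, "b": 0.4}  # e.g. Llama-3-8B-Instruct
ko_cp_weights    = {"w": 0.9, "b": 0.7}  # e.g. Korean continued-pretrained model

merged = apply_chat_vector(base_weights, instruct_weights, ko_cp_weights)
# merged["w"] ≈ 0.9 + (1.2 - 1.0) ≈ 1.1
```

The appeal of the technique is that instruction-following behavior transfers without running any Korean instruction tuning, which matches this model's "preview" status.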

Intended Use

This model is intended as a foundation for developers building new Korean chat and instruction-following models. Its "preview" status signals that it is a strong base, but that further fine-tuning on dedicated Korean instruction datasets is expected to reach optimal performance for specific applications.
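Assuming this preview inherits the stock Llama-3 instruct prompt format (in practice `tokenizer.apply_chat_template` renders it for you), the wire format can be sketched as below; the helper name is hypothetical:

```python
def build_llama3_prompt(messages):
    """Render chat messages in the Llama-3 instruct wire format.
    In practice, tokenizer.apply_chat_template(messages, add_generation_prompt=True)
    produces this for you; this sketch only shows the resulting string layout."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    # Trailing assistant header cues the model to generate its reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama3_prompt([
    {"role": "system", "content": "You are a helpful Korean assistant."},
    {"role": "user", "content": "Hello"},
])
```

Using the tokenizer's own chat template is preferable in real code, since it stays correct if the template differs from the stock Llama-3 one.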

Popular Sampler Settings

Featherless surfaces the three parameter combinations most used by its users for this model. The configurable sampler parameters are:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
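The actual top-3 values are only visible in the interactive widget, but the knobs listed above map one-to-one onto common text-generation sampling parameters. A hypothetical config with illustrative placeholder values (not the Featherless user statistics):

```python
# Hypothetical sampler config; values are illustrative placeholders,
# not the actual Featherless top-3 statistics.
sampler_config = {
    "temperature": 0.7,         # softmax temperature; lower = more deterministic
    "top_p": 0.9,               # nucleus sampling: smallest token set with cumulative prob >= 0.9
    "top_k": 50,                # consider only the 50 most likely next tokens
    "frequency_penalty": 0.0,   # penalize tokens proportionally to how often they appeared
    "presence_penalty": 0.0,    # penalize tokens that appeared at all
    "repetition_penalty": 1.1,  # values > 1.0 discourage verbatim repetition
    "min_p": 0.05,              # drop tokens below 5% of the top token's probability
}
```

A dict like this can typically be passed as keyword arguments to a `generate()` call or as fields in an OpenAI-compatible completion request, depending on the serving stack.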