beomi/Llama-3-Open-Ko-8B-Instruct-preview
beomi/Llama-3-Open-Ko-8B-Instruct-preview is an 8 billion parameter instruction-tuned language model developed by beomi, based on the Llama-3-8B architecture. The model underwent continued pre-training on publicly available resources, including over 60GB of deduplicated Korean texts, using the new Llama-3 tokenizer, with more than 17.7 billion tokens seen during pre-training. It serves as a strong starting point for creating new Korean chat and instruction models, even though this preview version has not been fine-tuned with a dedicated Korean instruction set.
Model Overview
beomi/Llama-3-Open-Ko-8B-Instruct-preview is an 8 billion parameter instruction-tuned language model developed by beomi. It is built upon the Llama-3-8B architecture and has undergone continued pre-training using publicly available datasets, comprising over 60GB of deduplicated texts. The model leverages the new Llama-3 tokenizer, with pre-training conducted on more than 17.7 billion tokens, surpassing the token count used for previous Llama-2-Ko models.
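As a quick way to see the Llama-3 tokenizer at work on Korean text, the tokenizer bundled with the model repository can be loaded directly with 🤗 Transformers. This is a minimal sketch; the example sentence is illustrative:

```python
from transformers import AutoTokenizer

# Load the tokenizer shipped with the model repo (the new Llama-3 tokenizer).
tokenizer = AutoTokenizer.from_pretrained("beomi/Llama-3-Open-Ko-8B-Instruct-preview")

text = "안녕하세요, 오늘 날씨가 어때요?"
tokens = tokenizer.tokenize(text)
print(len(tokens), tokens)  # inspect how the vocabulary segments Korean text
```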
Key Characteristics
- Base Model: Llama-3-8B architecture.
- Pre-training Data: Over 60GB of deduplicated public texts.
- Tokenizer: Uses the new Llama-3 tokenizer; pre-training covered 17.7B+ tokens.
- Instruction Tuning: A preview instruction model built by applying the idea from the Chat Vector paper (see the sketch after this list); it is explicitly noted as not fine-tuned with a dedicated Korean instruction set.
- Development Support: Training was conducted on TPUv5e-256 with support from the Google TRC program.
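For context, the Chat Vector idea referenced above amounts to simple weight arithmetic: subtract a base model's weights from its instruction-tuned counterpart, then add that difference to a continued-pretrained model. The sketch below illustrates the general technique, not the author's exact recipe; the model IDs are the standard Meta releases plus the Korean base model, and loading all three requires substantial memory:

```python
import torch
from transformers import AutoModelForCausalLM

# Chat Vector: instruct_ko ≈ base_ko + (instruct_en - base_en)
base_ko = AutoModelForCausalLM.from_pretrained(
    "beomi/Llama-3-Open-Ko-8B", torch_dtype=torch.bfloat16
)
base_en = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B", torch_dtype=torch.bfloat16
)
inst_en = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct", torch_dtype=torch.bfloat16
)

with torch.no_grad():
    # All three models share the Llama-3 tokenizer, so parameter shapes line up,
    # including the embedding and lm_head matrices.
    for p_ko, p_base, p_inst in zip(
        base_ko.parameters(), base_en.parameters(), inst_en.parameters()
    ):
        p_ko.add_(p_inst - p_base)

base_ko.save_pretrained("llama-3-open-ko-8b-chat-vector")
```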
Intended Use
This model is designed as a foundational model for developers creating new Korean chat and instruction-following models. Because this preview has not been fine-tuned with a dedicated Korean instruction set, it is best used as a strong base for further fine-tuning on specific Korean instruction datasets to achieve optimal performance for a given application.
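A minimal generation sketch with 🤗 Transformers follows; the sampling hyperparameters are illustrative defaults, not the author's recommended settings, and the chat template is assumed to follow the standard Llama-3 format shipped with the repo:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "beomi/Llama-3-Open-Ko-8B-Instruct-preview"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a chat prompt using the tokenizer's chat template.
messages = [{"role": "user", "content": "한국의 수도는 어디인가요?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
# Decode only the newly generated tokens.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```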