masterjae/T-Llama-3-8B

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Published: Oct 2, 2024 · Architecture: Transformer

T-Llama-3-8B is an 8-billion-parameter Korean continual pre-trained language model developed by TmaxAI, based on Meta-Llama-3-8B. It was trained on 18.8 billion tokens from a 64 GB Korean corpus and supports an 8192-token context length. Trained with DeepSpeed, gradient checkpointing, FlashAttention-2, and BF16 mixed precision, it significantly outperforms the original Llama-3-8B and achieves performance comparable to leading Korean continual pre-training models on multilingual translation tasks.


T-Llama-3-8B: Korean Continual Pre-trained Model

T-Llama-3-8B is an 8 billion parameter language model developed by TmaxAI, continually pre-trained on the Meta-Llama-3-8B architecture. It was trained on a substantial 18.8 billion tokens from a 64GB Korean corpus, enhancing its proficiency in the Korean language. The model supports an 8192-token sequence length, making it suitable for processing longer texts.
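As a sketch of typical usage, the model can be loaded with the Hugging Face `transformers` library. The repo id below is taken from this page's header, and the BF16 and FlashAttention-2 settings mirror the training configuration described here; the `truncate_to_context` helper and the 256-token output reserve are illustrative assumptions, not part of the official model card.

```python
MODEL_ID = "masterjae/T-Llama-3-8B"  # repo id from this page's header
MAX_CTX = 8192                        # context length stated in the model card


def truncate_to_context(token_ids, reserve_for_output=256, max_ctx=MAX_CTX):
    """Keep a prompt inside the 8192-token window, leaving room for generation.

    The 256-token reserve is an illustrative default, not a documented value.
    """
    budget = max_ctx - reserve_for_output
    return token_ids[-budget:] if len(token_ids) > budget else token_ids


def run_demo():
    # Sketch only: requires `transformers`, `torch`, and a suitable GPU.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,               # BF16, matching the training setup
        attn_implementation="flash_attention_2",  # optional; needs flash-attn installed
        device_map="auto",
    )
    prompt = "한국의 수도는 어디인가요?"  # "What is the capital of Korea?"
    ids = truncate_to_context(tokenizer(prompt)["input_ids"])
    inputs = torch.tensor([ids]).to(model.device)
    output = model.generate(inputs, max_new_tokens=128)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Since the model is served here in FP8 but was trained in BF16, loading in FP8 locally would require a separate quantization runtime; the sketch above sticks to the training precision.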

Key Capabilities

  • Enhanced Korean Language Understanding: Specialized training on a large Korean corpus significantly improves its performance for Korean-centric tasks.
  • Multilingual Translation: Demonstrates strong multilingual machine translation (MMT) performance, outperforming the base Llama-3-8B and achieving competitive results against other leading Korean continual pre-training models, particularly in the EN<->KO, JA<->KO, and ZH<->KO directions.
  • Optimized Training: Integrates advanced optimization techniques, including DeepSpeed, gradient checkpointing, FlashAttention-2, and BF16 mixed precision, for efficient training and inference.
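Because T-Llama-3-8B is a continual pre-trained base model rather than an instruction-tuned one, translation is typically elicited with a few-shot prompt. The exact prompt template TmaxAI used for its MMT evaluation is not documented on this page, so the format below, including the `build_translation_prompt` helper, is an assumption for illustration.

```python
# Hedged sketch: this few-shot prompt format is an assumption, not the
# documented evaluation template for T-Llama-3-8B.

LANG_NAMES = {"EN": "English", "KO": "Korean", "JA": "Japanese", "ZH": "Chinese"}


def build_translation_prompt(src, tgt, text, examples=()):
    """Build a few-shot translation prompt for a base (non-instruct) model.

    `examples` is an iterable of (source_text, target_text) demonstration pairs.
    """
    header = f"Translate from {LANG_NAMES[src]} to {LANG_NAMES[tgt]}.\n\n"
    shots = "".join(
        f"{LANG_NAMES[src]}: {s}\n{LANG_NAMES[tgt]}: {t}\n\n" for s, t in examples
    )
    # End on the target-language label so the model continues with the translation.
    return header + shots + f"{LANG_NAMES[src]}: {text}\n{LANG_NAMES[tgt]}:"
```

A prompt built this way would be tokenized and passed to `model.generate` as usual; stopping at the first newline in the completion is a common way to isolate the translated sentence.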

Good For

  • Applications requiring robust Korean language processing.
  • Multilingual translation tasks involving Korean.
  • Researchers and developers looking for a high-performance, continually pre-trained Korean LLM.