Bllossom/llama-3-Korean-Bllossom-70B Overview
Bllossom/llama-3-Korean-Bllossom-70B is a 70.8-billion-parameter Korean-English bilingual language model built on Llama-3. Developed through a collaboration between MLPLab at Seoultech, Teddysum, and Yonsei University, the model is designed to strengthen Korean language processing while maintaining strong English proficiency.
Key Capabilities
- Extensive Korean Vocabulary Expansion: Adds over 30,000 Korean tokens to the vocabulary, significantly improving Korean expressiveness.
- Enhanced Korean Context Handling: Handles approximately 25% more Korean text within the same context window than the base Llama-3 model, a direct benefit of the expanded vocabulary.
- Bilingual Knowledge Linking: Utilizes Korean-English parallel corpora for pre-training to establish robust knowledge connections between the two languages.
- Culturally Aligned Instruction Tuning: Fine-tuned with custom instruction-following data tailored to Korean language nuances and cultural contexts, developed by linguists.
- Preference Alignment (DPO): Aligned with human preference data via Direct Preference Optimization for improved response quality.
- Vision-Language Alignment: The Bllossom project also includes a vision-language model (Bllossom-V) that aligns vision transformers with this language model.
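The context-length benefit of vocabulary expansion can be seen from how text is encoded when a tokenizer lacks dedicated Korean tokens. The sketch below is illustrative only (it does not use the model's actual tokenizer): it compares raw UTF-8 byte counts, the worst case a byte-fallback tokenizer would pay per Hangul syllable.

```python
# Illustrative only: why dedicated Korean tokens shorten sequences.
# Without Korean vocabulary, a byte-fallback tokenizer spends up to
# 3 tokens (bytes) per Hangul syllable versus 1 per ASCII character.
korean = "안녕하세요"   # 5 Hangul syllables ("Hello")
english = "Hello"      # 5 ASCII characters

korean_bytes = len(korean.encode("utf-8"))    # 3 bytes per syllable
english_bytes = len(english.encode("utf-8"))  # 1 byte per character

print(korean_bytes, english_bytes)  # 15 vs 5
# Mapping common syllables and words to single tokens lets the same
# context window hold more Korean text, consistent with the roughly
# 25% longer effective Korean context described above.
```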
Ideal Use Cases
- Applications requiring high-quality Korean language generation and understanding.
- Bilingual Korean-English tasks, leveraging its knowledge linking capabilities.
- Building custom models on a strong Korean linguistic foundation; the model's license permits commercial use.
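For the use cases above, prompts are typically formatted with the standard Llama-3 instruct chat template, which this model is assumed to inherit from its base. The sketch below builds that template by hand so it runs without downloading weights; in practice you would call `tokenizer.apply_chat_template` from Hugging Face `transformers` and pass the result to the loaded model.

```python
# Sketch, assuming the standard Llama-3 chat template applies.
# The example Korean instruction is hypothetical.

def build_llama3_prompt(system: str, user: str) -> str:
    """Format a single-turn conversation in the Llama-3 instruct template."""
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "You are a helpful bilingual Korean-English assistant.",
    "서울의 주요 관광지를 소개해 주세요.",  # "Please introduce Seoul's main sights."
)
print(prompt)
```

The generated string ends with the opened assistant header, so the model continues from there when the prompt is tokenized and passed to `generate`.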