Model Overview
Saxo/Linkbricks-Horizon-AI-Korean-Pro-27B is a 27-billion-parameter Korean language model developed by Linkbricks Horizon-AI, led by data scientist Yunsung Ji (Saxo). It is built on the gemma-2-27b-it base model and underwent Continued Pre-training (CPT), Supervised Fine-tuning (SFT), and Direct Preference Optimization (DPO) on eight H100-80G GPUs.
Key Capabilities
- Multilingual Processing: Trained on a corpus of 90 million Korean news documents plus cross-lingual training data for Korean, Chinese, English, and Japanese, enabling robust performance across all four languages.
- Advanced Reasoning: Enhanced to handle complex logical problems and mathematical tasks.
- High-Dimensional Analysis: Optimized for sophisticated analysis of customer reviews and social media posts.
- Content Generation: Strong capabilities in coding and creative writing.
- Technical Implementation: Trained with DeepSpeed Stage 3, rsLoRA, and BAdam layer mode for efficient training. The base model's tokenizer is used as-is, without vocabulary expansion.
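Since the model inherits the gemma-2-27b-it base and its unmodified tokenizer, prompts presumably follow Gemma-2's chat turn format. The sketch below illustrates that format with a small helper; the turn markers are an assumption based on the stated base model, not something this card specifies, and the commented loading code is a standard transformers pattern shown for orientation only (it requires downloading the full 27B weights).

```python
# Sketch of prompting this model, assuming Gemma-2's chat format
# (inherited from the gemma-2-27b-it base). The turn markers below are
# an assumption; in practice the tokenizer's chat template handles this.

def format_gemma2_prompt(user_message: str) -> str:
    """Wrap a user message in Gemma-2-style chat turn markers."""
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = format_gemma2_prompt("서울의 역사를 간단히 요약해 주세요.")
print(prompt)

# A typical transformers loading path (hypothetical settings; needs a
# large GPU and a full weight download, so it is commented out here):
#
# from transformers import AutoModelForCausalLM, AutoTokenizer
# model_id = "Saxo/Linkbricks-Horizon-AI-Korean-Pro-27B"
# tok = AutoTokenizer.from_pretrained(model_id)
# model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
# inputs = tok.apply_chat_template(
#     [{"role": "user", "content": "서울의 역사를 간단히 요약해 주세요."}],
#     add_generation_prompt=True, return_tensors="pt").to(model.device)
# print(tok.decode(model.generate(inputs, max_new_tokens=256)[0]))
```

Letting `apply_chat_template` build the prompt is preferable in real use, since it reads the template shipped with the tokenizer rather than hard-coding turn markers.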
Good For
- Applications requiring deep understanding and generation of Korean text.
- Tasks involving cross-lingual processing between Korean, Chinese, English, and Japanese.
- Use cases demanding strong logical reasoning and mathematical problem-solving.
- Analyzing and summarizing large volumes of customer feedback and social media data.
- Developing applications that require code generation or creative writing in Korean.