Saxo/Linkbricks-Horizon-AI-Korean-Advanced-70B

Text Generation · Concurrency Cost: 4 · Model Size: 70B · Quant: FP8 · Ctx Length: 32k · Published: Sep 4, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

Linkbricks-Horizon-AI-Korean-Advanced-70B is a 70-billion-parameter Korean language model developed by Yunsung Ji (Saxo) at Linkbricks. It is based on Hermes-3-Llama-3.1-70B and has undergone continued pre-training, supervised fine-tuning, and DPO on a corpus of 10 million Korean news items. The model is specifically enhanced for high-dimensional analysis of customer reviews and social posts, coding, writing, mathematics, and complex logical reasoning, and it offers a 128k context window with support for Korean function calling and tool calling.
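Because the model builds on Hermes-3-Llama-3.1-70B, prompts can plausibly be assembled in the ChatML-style format used by the Hermes family. The sketch below illustrates that assumption with hand-rolled formatting; the chat template shipped in the model repository is authoritative, and the message contents are illustrative only.

```python
# Minimal sketch of a ChatML-style prompt, assuming the model inherits the
# Hermes-3 chat format (<|im_start|> / <|im_end|> delimiters). Verify against
# the tokenizer's chat template before relying on this layout.

def build_chatml_prompt(messages):
    """Render a list of {"role", "content"} dicts as a ChatML string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Leave the assistant turn open so the model continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    # "You are a helpful Korean assistant."
    {"role": "system", "content": "당신은 유용한 한국어 어시스턴트입니다."},
    # "Please analyze the sentiment of this customer review."
    {"role": "user", "content": "이 고객 리뷰의 감성을 분석해 주세요."},
])
```

In practice you would pass such messages to the tokenizer's own `apply_chat_template` rather than formatting by hand; the sketch only shows what the rendered prompt looks like.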


Linkbricks-Horizon-AI-Korean-Advanced-70B Overview

Developed by Yunsung Ji (Saxo), a data scientist at Linkbricks, this 70-billion-parameter Korean language model is built on the Hermes-3-Llama-3.1-70B base. It underwent continued pre-training (CPT), supervised fine-tuning (SFT), and DPO on 8 H100-80G GPUs, with roughly 20% of its parameters trained specifically for Korean.

Key Capabilities & Enhancements

  • Multilingual Cross-Training: Utilizes 10 million Korean news corpora and cross-training data for Korean, Chinese, English, and Japanese, enhancing its ability to handle complex logical and mathematical problems across these languages.
  • Extended Context Window: Features a 128k-Context Window, allowing for processing and understanding longer sequences of text.
  • Specialized Task Performance: Significantly enhanced for:
    • High-dimensional analysis of customer reviews and social posts.
    • Coding and writing tasks.
    • Mathematical problem-solving and logical reasoning.
  • Korean Function & Tool Calling: Supports native Korean function calling and tool calling, easing integration into applications and agent pipelines.
  • Training Methodology: Employs techniques such as DeepSpeed ZeRO Stage 3, rsLoRA, and BAdam layer mode for efficient and effective training.
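To make the function/tool-calling capability concrete, the sketch below defines a tool in the common JSON-schema style and pairs it with a Korean user request. The tool name, fields, and system-message convention are hypothetical illustrations; the model's exact expected format is defined by its chat template.

```python
import json

# Hypothetical tool definition in the widely used JSON-schema style for
# function/tool calling. All names and descriptions here are illustrative,
# not the model's documented format.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool name
        "description": "지정한 도시의 현재 날씨를 조회합니다.",  # "Look up the current weather for a city."
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "도시 이름, 예: 서울"},  # "City name, e.g. Seoul"
            },
            "required": ["city"],
        },
    },
}

# A system turn advertising the tool, followed by a Korean user request.
messages = [
    {
        "role": "system",
        "content": "사용 가능한 도구: "  # "Available tools: "
        + json.dumps([weather_tool], ensure_ascii=False),
    },
    {"role": "user", "content": "서울 날씨 알려줘"},  # "Tell me the weather in Seoul."
]
```

The model would be expected to respond with a structured call such as `get_weather(city="서울")`, which the host application executes before returning the result in a follow-up turn.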

Use Cases

This model is particularly well-suited for applications requiring deep understanding and generation of Korean text, especially in domains involving complex data analysis, programming assistance, content creation, and advanced logical reasoning.