Saxo/Linkbricks-Horizon-AI-Nous-Hermes-3-Llama3.1-Korean-cpt-8b

Warm
Public
8B
FP8
32768
Sep 17, 2024
License: apache-2.0
Hugging Face
Overview

Saxo/Linkbricks-Horizon-AI-Nous-Hermes-3-Llama3.1-Korean-cpt-8b Overview

This model is a specialized Korean language model developed by Dr. Yunsung Ji (Saxo), a data scientist at Linkbricks. It is built upon the robust NousResearch/Hermes-3-Llama-3.1-8B base model and has undergone extensive Continued Pre-Training (CPT) using 8 H100-80G GPUs.

Key Capabilities & Features

  • Korean Language Optimization: Approximately 25% of the model's total parameters were re-tuned using a diverse Korean corpus, including 50 million Korean news articles, to enhance its proficiency in the Korean language.
  • Extended Context Window: Features a substantial 128k-Context Window, enabling the processing and understanding of very long Korean texts.
  • Function and Tool Calling: Supports Korean Function Call and Tool Calling, allowing for integration with external tools and APIs for more complex tasks.
  • Training Methodology: Utilizes advanced training techniques such as Deepspeed Stage=3, rslora, and BAdam Layer Mode for efficient and effective CPT.
  • Base Tokenizer: The tokenizer from the base model is used without expansion, maintaining compatibility while focusing on parameter re-tuning for Korean.

When to Use This Model

This model is ideal for applications requiring high-performance Korean language understanding and generation, especially those benefiting from:

  • Long-form Korean text processing: Due to its 128k context window.
  • Integration with external systems: Through its Korean Function Call and Tool Calling capabilities.
  • Further fine-tuning: Designed as a strong Korean base model for subsequent Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) to suit specific use cases.