Saxo/Linkbricks-Horizon-AI-Korean-llama3.1-sft-rlhf-dpo-8B

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Aug 9, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

Saxo/Linkbricks-Horizon-AI-Korean-llama3.1-sft-rlhf-dpo-8B is an 8 billion parameter Korean language model developed by Linkbricks Horizon-AI, fine-tuned from NousResearch/Meta-Llama-3.1-8B-Instruct. It utilizes SFT, RLHF, and DPO techniques with Korean-Chinese-English-Japanese cross-training data to enhance logical problem-solving in Korean. The model features a 32768-token context window and is specifically strengthened for high-level analysis of customer reviews, social media postings, and coding tasks.


Model Overview

Saxo/Linkbricks-Horizon-AI-Korean-llama3.1-sft-rlhf-dpo-8B is an 8 billion parameter Korean language model developed by Linkbricks Horizon-AI. It is built upon the NousResearch/Meta-Llama-3.1-8B-Instruct base model and has undergone a rigorous fine-tuning process involving Supervised Fine-Tuning (SFT), Reinforcement Learning from Human Feedback (RLHF), and Direct Preference Optimization (DPO) on KT-CLOUD using H100-80G GPUs.

Key Capabilities

  • Multilingual Logical Reasoning: Enhanced with Korean-Chinese-English-Japanese cross-training data and logical data, enabling it to handle complex Korean logical problems.
  • Extended Context Window: Supports a 32768-token context window, allowing it to process longer inputs and maintain conversational coherence.
  • Specialized Analysis: Strengthened for high-level analysis of customer reviews and social media postings.
  • Coding Proficiency: Improved capabilities in code generation and understanding.
  • Tool Calling Support: Includes support for tool calling functionalities.
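Since the card notes the tokenizer is unchanged from the Meta-Llama-3.1-8B-Instruct base model, the standard Llama 3.1 chat format applies. The sketch below builds that prompt format by hand for illustration; in practice, `tokenizer.apply_chat_template` from Hugging Face transformers does this for you, and the example messages are purely illustrative.

```python
# Minimal sketch: rendering messages into the Llama 3.1 chat format,
# which this model inherits from its base model (an assumption based on
# the card's note that the tokenizer is unchanged).

def build_llama31_prompt(messages):
    """Render a list of {role, content} dicts into Llama 3.1 chat format."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>"
            f"\n\n{m['content']}<|eot_id|>"
        )
    # Leave the assistant header open so the model continues from here.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama31_prompt([
    {"role": "system", "content": "You are a helpful Korean assistant."},
    {"role": "user", "content": "이 고객 리뷰를 분석해 주세요."},
])
```

The resulting string can be tokenized and passed to the model directly, or the same structure can be sent as a `messages` array to any OpenAI-compatible endpoint serving the model.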

Technical Details

The model was trained with advanced techniques including DeepSpeed Stage 3, rsLoRA, and FlashAttention 2. The tokenizer remains identical to the base model's, with no vocabulary expansion, ensuring compatibility and efficient processing. This model is particularly suited for applications requiring robust Korean language understanding, complex logical reasoning, and specialized text analysis in a multilingual context.

Popular Sampler Settings

The top three parameter combinations used by Featherless users for this model adjust the following sampler settings (specific values not shown here):

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
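As an illustration of how these settings are supplied in practice, the sketch below assembles a request payload for an OpenAI-compatible chat completions endpoint. The specific values are illustrative placeholders, not the actual Featherless user statistics, and the endpoint URL is omitted since it depends on your provider.

```python
import json

# Illustrative sampler values only; substitute the combination you prefer.
sampler_settings = {
    "temperature": 0.7,
    "top_p": 0.9,
    "top_k": 40,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
    "repetition_penalty": 1.1,
    "min_p": 0.05,
}

payload = {
    "model": "Saxo/Linkbricks-Horizon-AI-Korean-llama3.1-sft-rlhf-dpo-8B",
    "messages": [{"role": "user", "content": "안녕하세요"}],
    **sampler_settings,
}

# Serialized body for an HTTP POST; ensure_ascii=False keeps Korean readable.
body = json.dumps(payload, ensure_ascii=False)
```

Note that `top_k`, `repetition_penalty`, and `min_p` are extensions beyond the core OpenAI API schema; many open-model inference servers accept them, but support varies by provider.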