Saxo/Linkbricks-Horizon-AI-Japanese-Superb-V4-70B

Hugging Face
Text Generation · Concurrency Cost: 4 · Model Size: 70B · Quant: FP8 · Context Length: 32k · License: apache-2.0 · Architecture: Transformer · Open Weights

Saxo/Linkbricks-Horizon-AI-Japanese-Superb-V4-70B is a 70 billion parameter Japanese-enhanced language model developed by Saxo (Yunsung Ji) of Linkbricks Horizon-AI. Fine-tuned with SFT and DPO on a corpus of 30 million Japanese news and wiki documents, it excels at cross-lingual tasks (Japanese, Korean, Chinese, English) and complex logical reasoning. The model is particularly strong in high-dimensional analysis of customer reviews and social posts, as well as coding, writing, mathematics, and logical decision-making, and features a 128k context window with Function Calling support.


Model Overview

Saxo/Linkbricks-Horizon-AI-Japanese-Superb-V4-70B is a 70 billion parameter language model developed by Saxo (Yunsung Ji), a data scientist and CEO at Linkbricks Horizon-AI. This model is an enhanced version, fine-tuned from the Saxo/Linkbricks-Horizon-AI-Japanese-Superb-V3-70B base model using SFT (Supervised Fine-Tuning) and DPO (Direct Preference Optimization) techniques on 8 H100-80G GPUs.
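DPO fine-tunes the policy model directly on preference pairs (a chosen and a rejected response), without training a separate reward model. A minimal sketch of the DPO objective in pure Python; the log-probability values are illustrative, not taken from this model's training run:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Each argument is the summed log-probability of a response under the
    policy model or the frozen reference (SFT) model.
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # over the rejected response, relative to the reference model.
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(logits)): shrinks as the policy favors the chosen response.
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# When the policy prefers the chosen response more strongly than the
# reference does, the loss drops below -log(0.5).
loss = dpo_loss(-10.0, -14.0, -11.0, -13.0)
```

In practice the log-probabilities come from full forward passes of the policy and reference models; `beta` controls how far the policy may drift from the reference.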

Key Capabilities & Training

  • Japanese Language Enhancement: Trained extensively on a corpus of 30 million Japanese news and wiki documents.
  • Cross-Lingual Proficiency: Utilizes cross-training data for Japanese, Korean, Chinese, and English, enabling robust performance across these languages.
  • Advanced Reasoning: Specifically trained with mathematical and logical judgment data to handle complex logical problems.
  • Extended Context Window: Features a 128k context window, allowing for processing longer inputs and maintaining coherence over extended conversations.
  • Function Calling: Supports Function Call and Tool Calling, enhancing its utility for integration with external systems and complex task execution.
  • Specialized Analysis: Enhanced for high-dimensional analysis of customer reviews and social posts.
  • Core Skills: Demonstrates strengthened capabilities in coding, writing, mathematics, and logical decision-making.
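The Function Calling support above means the model can emit structured calls to externally defined tools. A minimal sketch of the application side, assuming the OpenAI-style tool schema commonly used for function calling; the `get_weather` tool and its implementation are hypothetical examples, not part of the model:

```python
import json

# Illustrative tool definition in the OpenAI-style function-calling schema;
# the model is shown this and may emit a matching structured call.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Hypothetical local implementations the application maps tool names to.
IMPLEMENTATIONS = {"get_weather": lambda city: {"city": city, "temp_c": 21}}

def dispatch(tool_call: dict) -> str:
    """Execute a model-emitted tool call and return a JSON result string
    to be fed back to the model as a tool-role message."""
    fn = IMPLEMENTATIONS[tool_call["function"]["name"]]
    args = json.loads(tool_call["function"]["arguments"])
    return json.dumps(fn(**args))

# A tool call in the shape the model might emit it.
call = {"function": {"name": "get_weather",
                     "arguments": json.dumps({"city": "Tokyo"})}}
result = dispatch(call)
```

The result string is then appended to the conversation so the model can compose its final answer from the tool output.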

Technical Details

  • Tokenizer: Uses the base model's tokenizer without vocabulary expansion.
  • Training Methods: Employs Deepspeed Stage=3, rslora, and BAdam Layer Mode during training.
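The training stack named above can be sketched as configuration. Only the technique names (DeepSpeed ZeRO stage 3, rsLoRA) come from the model card; every hyperparameter value below is an illustrative assumption, and the framework-specific BAdam layer-mode settings are omitted:

```python
# Hedged sketch of the training setup; values are illustrative assumptions.

# DeepSpeed ZeRO stage 3 shards optimizer state, gradients, AND parameters
# across GPUs, which is what lets a 70B model be fine-tuned on 8 H100-80G.
deepspeed_config = {
    "zero_optimization": {
        "stage": 3,
        "overlap_comm": True,
        "stage3_gather_16bit_weights_on_model_save": True,
    },
    "bf16": {"enabled": True},
    "train_micro_batch_size_per_gpu": 1,
}

# rsLoRA (rank-stabilized LoRA) rescales the adapter update by
# alpha / sqrt(r) instead of plain LoRA's alpha / r, keeping the update
# magnitude stable as the rank r grows.
lora_config = {
    "r": 64,              # illustrative rank
    "lora_alpha": 32,     # illustrative scaling factor
    "use_rslora": True,   # alpha / sqrt(r) scaling
}

def lora_scaling(cfg: dict) -> float:
    """Effective scaling factor applied to the LoRA update."""
    r, alpha = cfg["r"], cfg["lora_alpha"]
    return alpha / r ** 0.5 if cfg.get("use_rslora") else alpha / r
```

With these illustrative values, rsLoRA scales the update by 32/√64 = 4.0, versus 32/64 = 0.5 for plain LoRA at the same rank.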

Use Cases

This model is particularly well-suited for applications requiring strong Japanese language understanding and generation, cross-lingual processing, complex logical problem-solving, and advanced analytical tasks in areas like customer feedback analysis and content creation.
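As a concrete sketch of the customer-review analysis use case, the snippet below builds an OpenAI-style chat request for the model. The system prompt and helper function are illustrative assumptions; only the model id comes from this card:

```python
# Hedged sketch: assembling a chat request for customer-review analysis.
# The prompt wording and helper are illustrative, not the author's method.

MODEL_ID = "Saxo/Linkbricks-Horizon-AI-Japanese-Superb-V4-70B"

def review_analysis_messages(reviews, language="Japanese"):
    """Build an OpenAI-style messages list asking the model to analyze
    customer reviews and respond in the requested language."""
    numbered = "\n".join(f"{i + 1}. {r}" for i, r in enumerate(reviews))
    return [
        {"role": "system",
         "content": ("You are an analyst. Summarize the sentiment and key "
                     f"themes of the reviews below. Respond in {language}.")},
        {"role": "user", "content": numbered},
    ]

messages = review_analysis_messages(
    ["配送がとても速かった。", "品質は良いが値段が高い。"])
# This list would be passed to any OpenAI-compatible endpoint serving the
# model, e.g. client.chat.completions.create(model=MODEL_ID,
#                                            messages=messages)
```

Because the model is cross-trained on Japanese, Korean, Chinese, and English, the same request can mix review languages while pinning the response language via the system prompt.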

Popular Sampler Settings

The most popular parameter combinations used by Featherless users for this model adjust the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p