xiangxinai/Xiangxin-2XL-Chat-1048k-Chinese-Llama3-70B

Hugging Face
TEXT GENERATIONConcurrency Cost:4Model Size:70BQuant:FP8Ctx Length:8kTool Calling:SupportedPublished:May 21, 2024License:llama3Architecture:Transformer0.0K Warm

The Xiangxin-2XL-Chat-1048k is a 70 billion parameter chat model developed by Xiangxin AI, based on Meta Llama-3-70B-Instruct and Gradient AI's expanded context work. It features an extended context length of 1 million words and is specifically fine-tuned with a proprietary Chinese value-aligned dataset using ORPO training. This model is optimized for enhanced Chinese language proficiency and cultural alignment, achieving an average score of 70.22 across eight benchmarks, surpassing its base model.

Loading preview...

Overview

Xiangxin-2XL-Chat-1048k is a 70 billion parameter chat model developed by Xiangxin AI. It is built upon the Meta Llama-3-70B-Instruct model, incorporating context expansion techniques from Gradient AI to achieve an impressive 1 million word context length.

Key Capabilities & Training

This model is specifically designed for the Chinese language, having been fine-tuned with a proprietary Chinese value-aligned dataset using ORPO training. This process enhances its Chinese proficiency and cultural alignment. The training data did not include any evaluation datasets.

Performance

Xiangxin-2XL-Chat-1048k demonstrates strong performance, achieving an average score of 70.22 across eight benchmarks, including ARC, HellaSwag, MMLU, TruthfulQA_mc2, Winogrande, GSM8K_flex, CMMLU, and C-EVAL. This score surpasses the Llama-3-70B-Instruct-Gradient-1048k model, which scored 69.66 on the same evaluations.

Use Cases

This model is particularly well-suited for applications requiring:

  • Extended context understanding in Chinese.
  • Culturally aligned responses for Chinese users.
  • General chat and instruction-following tasks with a focus on Chinese language nuances.