Overview

Xiangxin-2XL-Chat-1048k is a 70-billion-parameter chat model developed by Xiangxin AI. It is built on Meta's Llama-3-70B-Instruct model and incorporates context-extension techniques from Gradient AI to reach a context length of roughly 1 million tokens (the "1048k" in its name).
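
To make the context figure concrete, here is a minimal sketch of a token-budget check. It assumes the "1048k" suffix denotes 2**20 = 1,048,576 tokens, which the summary does not state explicitly. The helper names fits_in_context and remaining_tokens are hypothetical and are not part of any Xiangxin AI or Llama tooling. Token counts would come from the model's own tokenizer, which is not shown here.

```python
# Assumption: "1048k" means 2**20 = 1,048,576 tokens of context.
CONTEXT_WINDOW_TOKENS = 2 ** 20


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if the prompt plus the requested generation fits in the window."""
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_new_tokens <= window


def remaining_tokens(prompt_tokens: int,
                     window: int = CONTEXT_WINDOW_TOKENS) -> int:
    """Return how many tokens are left for generation (never negative)."""
    if prompt_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return max(window - prompt_tokens, 0)


if __name__ == "__main__":
    # A hypothetical 900,000-token prompt with a 2,048-token reply budget.
    print(fits_in_context(900_000, 2_048))
    print(remaining_tokens(900_000))
```

A check like this is useful before sending very long documents, since a prompt that fills the whole window leaves no room for the reply.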

Key Capabilities & Training

This model targets Chinese-language use. It was fine-tuned with ORPO (Odds Ratio Preference Optimization) on a proprietary, Chinese value-aligned dataset to improve its Chinese proficiency and cultural alignment. The training data did not include any evaluation datasets.

Performance

Xiangxin-2XL-Chat-1048k achieves an average score of 70.22 across eight benchmarks: ARC, HellaSwag, MMLU, TruthfulQA_mc2, Winogrande, GSM8K_flex, CMMLU, and C-EVAL. This is 0.56 points above the Llama-3-70B-Instruct-Gradient-1048k model, which scored 69.66 on the same evaluations.

Use Cases

This model is particularly well-suited for applications requiring:

Extended context understanding in Chinese.
Culturally aligned responses for Chinese users.
General chat and instruction-following tasks with a focus on Chinese language nuances.
