Overview
XuanYuan-70B-Chat: A Finance-Optimized Llama2-70B Extension
XuanYuan-70B-Chat is a 70-billion-parameter instruction-tuned model developed by Duxiaoman-DI, based on the Llama2-70B architecture. Its primary goal is to retain strong general language capabilities while significantly enhancing performance in the financial domain.
Key Capabilities & Features
- Extended Context Length: Features an 8192-token context window, expanded during pre-training from Llama2's original 4k, making it suitable for long-text financial business scenarios.
- Financial Domain Enhancement: Underwent incremental pre-training on a large volume of high-quality Chinese and English data, mixing general and financial corpora at roughly a 9:1 ratio while gradually increasing the share of financial data (a toy sketch of such a schedule follows this list).
- Instruction Tuning: Uses a two-stage instruction fine-tuning process that combines 10M open-source multilingual instructions with 200k self-developed, high-quality general and financial instructions (4:1 ratio).
- Optimized Training: Achieves high training efficiency through acceleration strategies such as optimized data loading, distributed training frameworks, Flash Attention 2, and C++/CUDA fused operators.
- Quantization Support: Offers 8-bit and 4-bit quantized versions (XuanYuan-70B-Chat-8bit and XuanYuan-70B-Chat-4bit) to reduce memory requirements, with the 8-bit quantization being near-lossless; a hedged loading example also follows this list.
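The "9:1 general-to-financial mix with a gradually rising financial share" can be pictured as a simple curriculum sampler. The sketch below is purely illustrative and not the team's actual data pipeline; the end ratio, the linear schedule, and all names are assumptions.

```python
import random

def financial_fraction(step: int, total_steps: int,
                       start: float = 0.1, end: float = 0.5) -> float:
    """Linearly ramp the share of financial data from `start` to `end`.
    The start/end values and the linear schedule are illustrative assumptions."""
    t = min(max(step / max(total_steps, 1), 0.0), 1.0)
    return start + t * (end - start)

def sample_batch(general_corpus, financial_corpus, batch_size, step, total_steps):
    """Draw a mixed batch, biased more toward financial text as training progresses."""
    p_fin = financial_fraction(step, total_steps)
    batch = []
    for _ in range(batch_size):
        pool = financial_corpus if random.random() < p_fin else general_corpus
        batch.append(random.choice(pool))
    return batch

# Toy usage with stand-in corpora: early batches are ~90% general,
# late batches carry a much larger financial share.
general = ["general text A", "general text B"]
financial = ["earnings report excerpt", "central bank statement"]
print(sample_batch(general, financial, batch_size=4, step=0, total_steps=1000))
print(sample_batch(general, financial, batch_size=4, step=1000, total_steps=1000))
```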
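The quantized variants can typically be loaded through Hugging Face transformers with bitsandbytes. Below is a minimal sketch, assuming the model is hosted under the repo id shown and follows a Human/Assistant prompt style; the exact repo id, prompt template, and recommended generation settings should be taken from the official Duxiaoman-DI model card.

```python
# pip install torch transformers accelerate bitsandbytes
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Repo id assumed from the model names above; verify against the official model card.
MODEL_ID = "Duxiaoman-DI/XuanYuan-70B-Chat"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),  # or load_in_4bit=True
    device_map="auto",            # shard the weights across available GPUs
    torch_dtype=torch.float16,    # dtype for the non-quantized modules
)

# Prompt format is an assumption; the official model card documents the real template.
prompt = "Human: Please summarize the key risks in this quarterly earnings report.\n\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Even at 8-bit, a 70B model still needs roughly 70+ GB of GPU memory for the weights alone (typically spread across several devices); the 4-bit variant roughly halves that footprint at some quality cost.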
Good For
- Applications requiring strong general conversational abilities in Chinese.
- Financial industry use cases, including analysis of financial news and company announcements, and financial Q&A.
- Scenarios that require processing long financial texts, enabled by the extended 8192-token context window.
- Developers seeking a powerful 70B model with optimized performance for both general and specialized financial tasks.