Overview
XuanYuan-70B is a 70-billion-parameter large language model developed by Duxiaoman-DI on top of the Llama2-70B architecture. It is enhanced for Chinese through incremental pre-training on a mix of general and financial corpora. A key feature is its extended context length of 8192 tokens, introduced during the pre-training phase to support the long-text processing common in financial scenarios. The model is released in base and chat versions, with 8-bit and 4-bit quantized chat models also provided to reduce hardware requirements.
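A minimal loading sketch with Hugging Face Transformers is shown below. The repository id `Duxiaoman-DI/XuanYuan-70B` and the dtype/device settings are assumptions for illustration; check the official model card for the exact identifiers and recommended configuration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Duxiaoman-DI/XuanYuan-70B"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # full-precision weights need multi-GPU memory
    device_map="auto",           # shard layers across available GPUs
)

# Simple completion with the base model
inputs = tokenizer("北京的旅游景点有哪些？", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```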
Key Capabilities
- Financial Domain Specialization: Enhanced performance in financial tasks through targeted pre-training data (financial news, company announcements, financial encyclopedias, etc.) and instruction tuning.
- Extended Context Window: Supports an 8192-token context length, crucial for analyzing lengthy financial documents and complex business scenarios.
- Multilingual Support: Maintains strong English capabilities from its Llama2 base while significantly improving Chinese language proficiency through a 3:1 Chinese-English data ratio in incremental pre-training.
- Instruction Following: The chat model is fine-tuned in a two-stage process that combines 10M open-source multilingual instruction samples with 200K high-quality, self-developed general and financial instructions.
- Quantized Versions: Offers 8-bit and 4-bit quantized chat models that significantly reduce VRAM requirements (e.g., the 4-bit model needs about 35GB of VRAM versus 129GB for the full model) with minimal performance degradation; see the loading sketch after this list.
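The sketch below loads the chat model in 4-bit precision on the fly via bitsandbytes, assuming the repo id `Duxiaoman-DI/XuanYuan-70B-Chat`. The project also distributes pre-quantized chat checkpoints, so consult the model card for the official names and recommended quantization settings.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "Duxiaoman-DI/XuanYuan-70B-Chat"  # assumed repo id

# On-the-fly 4-bit quantization; the pre-quantized checkpoints can be used instead.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)
```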
Good for
- Financial Analysis: Ideal for applications requiring deep understanding and generation of financial texts, reports, and market analysis.
- Long Document Processing: Suitable for tasks involving lengthy financial documents, contracts, or research papers due to its 8K context window.
- Multilingual Financial Chatbots: Can power chatbots and conversational AI systems that need to operate in both Chinese and English within the financial sector.
- Resource-Constrained Deployment: The 8-bit and 4-bit quantized versions enable deployment on hardware with limited GPU memory, making it accessible for a wider range of users and applications.
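Building on the quantized chat model loaded in the earlier sketch, the snippet below shows a chat-style financial query. The "Human:/Assistant:" prompt format and sampling parameters are illustrative assumptions; follow the dialogue template documented in the official repository for best results.

```python
# Illustrative chat-style query in Chinese: "Please analyze the main risk factors
# in this financial report." The prompt template is an assumption, not the
# official format.
prompt = "Human: 请分析一下这份财报中的主要风险点。 Assistant:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
)

# Strip the prompt tokens and print only the generated reply
reply = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(reply)
```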