Duxiaoman-DI/XuanYuan-70B

70B parameters · 8192-token context · Released Sep 22, 2023 · License: llama2

Overview

XuanYuan-70B is a 70-billion-parameter large language model developed by Duxiaoman-DI on the Llama2-70B architecture. It was incrementally pre-trained on a mix of general and financial corpora, with extensive enhancement of its Chinese-language ability. A key feature is its 8192-token context length, extended during the pre-training phase to meet the long-text demands of financial scenarios. The model ships in base and chat versions, with 8-bit and 4-bit quantized chat models also provided to lower hardware requirements.
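Since the checkpoints live under the Duxiaoman-DI organization, loading should follow the usual transformers pattern. A minimal sketch, assuming a standard Llama2-compatible checkpoint layout (the chat and quantized variants have different repo ids, which are not shown here):

```python
def load_xuanyuan(repo_id: str = "Duxiaoman-DI/XuanYuan-70B"):
    """Load tokenizer and model; the unquantized weights need ~129 GB of GPU memory."""
    # Imported lazily so the sketch can be read without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        torch_dtype="auto",   # keep the checkpoint's native precision
        device_map="auto",    # shard layers across available GPUs
    )
    return tokenizer, model
```

`device_map="auto"` is what makes a 70B model practical on multi-GPU hosts: transformers places layers across devices instead of requiring the whole model on one card.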

Key Capabilities

  • Financial Domain Specialization: Enhanced performance in financial tasks through targeted pre-training data (financial news, company announcements, financial encyclopedias, etc.) and instruction tuning.
  • Extended Context Window: Supports an 8192-token context length, crucial for analyzing lengthy financial documents and complex business scenarios.
  • Multilingual Support: Maintains strong English capabilities from its Llama2 base while significantly improving Chinese language proficiency through a 3:1 Chinese-English data ratio in incremental pre-training.
  • Instruction Following: The chat model is fine-tuned using a two-stage process, combining 10M open-source multilingual instruction data with 200K high-quality, self-developed general and financial instruction data.
  • Quantized Versions: Offers 8-bit and 4-bit quantized chat models, significantly reducing VRAM requirements (e.g., the 4-bit model needs about 35 GB of VRAM versus 129 GB for the full-precision model) with minimal performance degradation.
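The VRAM figures in the quantization bullet follow from simple weight arithmetic (bytes ≈ parameters × bits ÷ 8). Activations and the KV cache add real overhead at runtime, so this helper is an illustrative lower bound, not the library's actual accounting:

```python
def weight_vram_gib(params_billions: float, bits_per_param: float) -> float:
    """Rough GPU-memory lower bound for model weights alone, in GiB."""
    n_bytes = params_billions * 1e9 * bits_per_param / 8
    return n_bytes / 2**30

# 70B at 16-bit precision: ~130 GiB, in line with the ~129 GB full-model figure.
full = weight_vram_gib(70, 16)

# 70B at 4 bits: ~33 GiB for weights; runtime overhead brings it near the quoted 35 GB.
quantized = weight_vram_gib(70, 4)
```

The same arithmetic explains why the 8-bit variant lands roughly halfway between the two: halving bits per parameter halves weight memory.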

Good for

  • Financial Analysis: Ideal for applications requiring deep understanding and generation of financial texts, reports, and market analysis.
  • Long Document Processing: Suitable for tasks involving lengthy financial documents, contracts, or research papers due to its 8K context window.
  • Multilingual Financial Chatbots: Can power chatbots and conversational AI systems that need to operate in both Chinese and English within the financial sector.
  • Resource-Constrained Deployment: The 8-bit and 4-bit quantized versions enable deployment on hardware with limited GPU memory, making it accessible for a wider range of users and applications.