Duxiaoman-DI/XuanYuan-70B

Hosted on Hugging Face

Text generation · Concurrency cost: 4 · Model size: 70B · Quant: FP8 · Context length: 8K · Published: Sep 22, 2023 · License: llama2 · Architecture: Transformer · Open weights

XuanYuan-70B is a 70-billion-parameter large language model developed by Duxiaoman-DI, based on the Llama2-70B architecture. It features significant Chinese language enhancement and is specifically optimized for financial domain applications while maintaining strong general capabilities. Notably, it extends the context length to 8192 tokens during pre-training, making it one of the first 70B models with such an extended context window. The model is designed to serve financial industry use cases while preserving robust general-purpose language understanding.


Overview

XuanYuan-70B is a 70-billion-parameter large language model developed by Duxiaoman-DI, built upon the Llama2-70B architecture. It underwent extensive Chinese language enhancement and incremental pre-training on a mix of general and financial corpora. A key innovation is its extended context length of 8192 tokens, achieved during the pre-training phase to address the need for long-text processing in financial scenarios. The model is available in base and chat versions, with 8-bit and 4-bit quantized chat models also provided to reduce hardware requirements.

Key Capabilities

  • Financial Domain Specialization: Enhanced performance in financial tasks through targeted pre-training data (financial news, company announcements, financial encyclopedias, etc.) and instruction tuning.
  • Extended Context Window: Supports an 8192-token context length, crucial for analyzing lengthy financial documents and complex business scenarios.
  • Multilingual Support: Maintains strong English capabilities from its Llama2 base while significantly improving Chinese language proficiency through a 3:1 Chinese-English data ratio in incremental pre-training.
  • Instruction Following: The chat model is fine-tuned in a two-stage process, combining 10M open-source multilingual instruction examples with 200K high-quality, self-developed general and financial instructions.
  • Quantized Versions: Offers 8-bit and 4-bit quantized chat models, significantly reducing VRAM requirements (e.g., the 4-bit model needs about 35 GB of VRAM versus roughly 129 GB for the unquantized model) with minimal performance degradation.
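The quoted VRAM figures roughly track bytes per weight. A back-of-envelope estimate for the weights alone (ignoring activations, KV cache, and framework overhead, which is why the measured full-model figure differs somewhat from the naive FP16 number):

```python
# Weights-only VRAM estimate for a 70B-parameter model.
# Real usage adds activations, KV cache, and framework overhead.
PARAMS = 70e9  # 70 billion parameters

def weight_gb(bits_per_param: float) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes)."""
    return PARAMS * bits_per_param / 8 / 1e9

print(f"fp16: {weight_gb(16):.0f} GB")  # 140 GB
print(f"int8: {weight_gb(8):.0f} GB")   # 70 GB
print(f"int4: {weight_gb(4):.0f} GB")   # 35 GB, matching the card's 4-bit figure
```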

Good for

  • Financial Analysis: Ideal for applications requiring deep understanding and generation of financial texts, reports, and market analysis.
  • Long Document Processing: Suitable for tasks involving lengthy financial documents, contracts, or research papers due to its 8K context window.
  • Multilingual Financial Chatbots: Can power chatbots and conversational AI systems that need to operate in both Chinese and English within the financial sector.
  • Resource-Constrained Deployment: The 8-bit and 4-bit quantized versions enable deployment on hardware with limited GPU memory, making it accessible for a wider range of users and applications.
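For the resource-constrained case, a minimal deployment sketch with Hugging Face `transformers` might look as follows. The repository name, the `build_prompt` helper, and the prompt template are assumptions for illustration; check the official model card for the published quantized checkpoints and the exact chat format.

```python
# Illustrative sketch only -- the repo id and prompt template below are
# assumptions, not taken from the official model card.

def build_prompt(question: str) -> str:
    """Hypothetical single-turn prompt wrapper; verify the real chat
    template on the model card before use."""
    return f"Human: {question}\nAssistant:"

def main() -> None:
    # Call on a machine with roughly 35 GB of free GPU memory
    # (the stated requirement for the 4-bit chat model).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Duxiaoman-DI/XuanYuan-70B-Chat-4bit"  # assumed repo name
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(build_prompt("什么是资产负债表?"),  # "What is a balance sheet?"
                       return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

`device_map="auto"` lets Accelerate spread the layers across whatever GPUs (and, if necessary, CPU memory) are available, which is often the practical choice for a 70B model on constrained hardware.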