Duxiaoman-DI/XuanYuan-70B

Hosted on Hugging Face

Text generation · Concurrency cost: 4 · Model size: 70B · Quant: FP8 · Context length: 8K · Published: Sep 22, 2023 · License: llama2 · Architecture: Transformer · Open weights

XuanYuan-70B is a 70-billion-parameter large language model developed by Duxiaoman-DI, based on the Llama2-70B architecture. It features significant Chinese language enhancement and is specifically optimized for financial domain applications while maintaining strong general capabilities. Notably, it extends the context length to 8192 tokens during pre-training, making it one of the first 70B models with such an extended context window. The model is designed to serve financial industry use cases while preserving robust general-purpose language understanding.


Overview

XuanYuan-70B is a 70-billion-parameter large language model developed by Duxiaoman-DI, built upon the Llama2-70B architecture. It underwent extensive Chinese language enhancement and incremental pre-training on a mix of general and financial corpora. A key innovation is its extended context length of 8192 tokens, achieved during the pre-training phase to address the need for long-text processing in financial scenarios. The model is available in base and chat versions, with 8-bit and 4-bit quantized chat models also provided to reduce hardware requirements.

Key Capabilities

  • Financial Domain Specialization: Enhanced performance in financial tasks through targeted pre-training data (financial news, company announcements, financial encyclopedias, etc.) and instruction tuning.
  • Extended Context Window: Supports an 8192-token context length, crucial for analyzing lengthy financial documents and complex business scenarios.
  • Multilingual Support: Maintains strong English capabilities from its Llama2 base while significantly improving Chinese language proficiency through a 3:1 Chinese-English data ratio in incremental pre-training.
  • Instruction Following: The chat model is fine-tuned in a two-stage process, combining 10M open-source multilingual instruction examples with 200K high-quality, self-developed general and financial instructions.
  • Quantized Versions: Offers 8-bit and 4-bit quantized chat models, significantly reducing VRAM requirements (e.g., the 4-bit model needs about 35 GB of VRAM versus roughly 129 GB for the unquantized model) with minimal performance degradation.
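The quoted VRAM figures roughly track bytes per weight. A back-of-envelope estimate for the weights alone (ignoring activations, KV cache, and framework overhead, which is why the measured full-model figure differs somewhat from the naive FP16 number):

```python
# Weights-only VRAM estimate for a 70B-parameter model.
# Real usage adds activations, KV cache, and framework overhead.
PARAMS = 70e9  # 70 billion parameters

def weight_gb(bits_per_param: float) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes)."""
    return PARAMS * bits_per_param / 8 / 1e9

print(f"fp16: {weight_gb(16):.0f} GB")  # 140 GB
print(f"int8: {weight_gb(8):.0f} GB")   # 70 GB
print(f"int4: {weight_gb(4):.0f} GB")   # 35 GB, matching the card's 4-bit figure
```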

Good for

  • Financial Analysis: Ideal for applications requiring deep understanding and generation of financial texts, reports, and market analysis.
  • Long Document Processing: Suitable for tasks involving lengthy financial documents, contracts, or research papers due to its 8K context window.
  • Multilingual Financial Chatbots: Can power chatbots and conversational AI systems that need to operate in both Chinese and English within the financial sector.
  • Resource-Constrained Deployment: The 8-bit and 4-bit quantized versions enable deployment on hardware with limited GPU memory, making it accessible for a wider range of users and applications.
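For the resource-constrained case, a minimal deployment sketch with Hugging Face `transformers` might look as follows. The repository name, the `build_prompt` helper, and the prompt template are assumptions for illustration; check the official model card for the published quantized checkpoints and the exact chat format.

```python
# Illustrative sketch only -- the repo id and prompt template below are
# assumptions, not taken from the official model card.

def build_prompt(question: str) -> str:
    """Hypothetical single-turn prompt wrapper; verify the real chat
    template on the model card before use."""
    return f"Human: {question}\nAssistant:"

def main() -> None:
    # Call on a machine with roughly 35 GB of free GPU memory
    # (the stated requirement for the 4-bit chat model).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Duxiaoman-DI/XuanYuan-70B-Chat-4bit"  # assumed repo name
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    inputs = tokenizer(build_prompt("什么是资产负债表?"),  # "What is a balance sheet?"
                       return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

`device_map="auto"` lets Accelerate spread the layers across whatever GPUs (and, if necessary, CPU memory) are available, which is often the practical choice for a 70B model on constrained hardware.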