Duxiaoman-DI/XuanYuan2-70B-Chat

Text generation · Concurrency cost: 4 · Model size: 70B · Quant: FP8 · Context length: 8k · Published: Feb 4, 2024 · License: llama2 · Architecture: Transformer

The XuanYuan2-70B-Chat model, developed by Duxiaoman-DI, is a 70 billion parameter instruction-tuned language model. It is a successor to XuanYuan-70B, enhanced through continued pre-training with high-quality data, instruction fine-tuning, and reinforcement learning with human feedback (RLHF). This model significantly improves general capabilities, safety, and financial domain performance, and supports an extended context length of 16k tokens.


XuanYuan2-70B-Chat: Enhanced General and Financial LLM

Duxiaoman-DI's XuanYuan2-70B-Chat is a 70 billion parameter instruction-tuned model, building upon the XuanYuan-70B base. It has undergone extensive continued pre-training with high-quality data, instruction fine-tuning, and reinforcement learning with human feedback (RLHF) to align better with human preferences.

Key Enhancements & Capabilities:

  • Improved General & Financial Performance: Significant advancements in overall capabilities, safety, and specialized financial domain understanding compared to its predecessor.
  • Extended Context Length: Supports a 16k token context window, enabling better processing of longer texts.
  • Optimized Training: Utilizes a novel data-bucketed dynamic pre-training method for efficiency and long-text modeling.
  • Quantized Versions: Offered in 8-bit and 4-bit variants to reduce hardware requirements, with the 4-bit model showing substantial throughput gains (60.32 tokens/s) when served with vLLM.
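To make the hardware savings concrete, here is a back-of-envelope calculation of weight storage for a 70B-parameter model at different precisions. This is a minimal sketch (the function name is illustrative, not from the model's tooling), and it counts weights only; activations, KV cache, and runtime overhead add to the totals.

```python
# Approximate weight memory for a 70B-parameter model at FP16, 8-bit,
# and 4-bit precision, illustrating why quantized variants need less
# hardware. Weights only; KV cache and activations are extra.
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9

N = 70e9  # 70 billion parameters
for label, bits in [("FP16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(N, bits):.0f} GB")
# FP16: ~140 GB, 8-bit: ~70 GB, 4-bit: ~35 GB
```

The roughly 4x reduction from FP16 to 4-bit is what moves a 70B model from multi-node territory toward a single multi-GPU server.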

Performance Highlights:

  • General Benchmarks: Achieves 72.7 on CEVAL and 72.7 on CMMLU, demonstrating enhanced Chinese language capabilities while maintaining English performance (70.8 MMLU).
  • Financial Benchmarks: Scores an average of 67.83 on the FinanceIQ benchmark, indicating improved financial knowledge and reasoning.

Ideal Use Cases:

  • Applications requiring strong general language understanding and generation.
  • Financial industry applications needing specialized knowledge and analysis.
  • Scenarios benefiting from longer context processing.
  • Deployments where optimized inference speed with quantized models (especially via vLLM) is crucial.
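For the vLLM deployment scenario above, a typical launch looks like the following. This is a hedged sketch, not the project's documented command: the model ID matches this card, but the `--max-model-len` and `--tensor-parallel-size` values are assumptions — check the released checkpoint for its actual supported context length and pick a tensor-parallel degree that fits your GPUs (a quantized 70B variant may also need a quantization flag matching its format).

```shell
# Hypothetical vLLM serving command for XuanYuan2-70B-Chat via the
# OpenAI-compatible API server. Flag values below are assumptions;
# adjust to the released checkpoint and your hardware.
python -m vllm.entrypoints.openai.api_server \
    --model Duxiaoman-DI/XuanYuan2-70B-Chat \
    --max-model-len 16384 \
    --tensor-parallel-size 4
```

Once running, the server exposes standard OpenAI-style endpoints (e.g. `/v1/completions`), so existing client code can target it with only a base-URL change.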