SUSTech/SUS-Chat-34B
SUS-Chat-34B is a 34 billion parameter bilingual Chinese-English dialogue model developed by Southern University of Science and Technology and IDEA-CCNL. Based on 01-ai/Yi-34B, it features high-quality instruction fine-tuning on millions of multilingual data, an expanded 8K context window, and excels in multi-turn dialogues and complex multilingual tasks. It demonstrates strong performance across various benchmarks, often surpassing models of similar scale and competing with larger models, particularly in Chinese language understanding and mathematical reasoning.
Loading preview...
SUS-Chat-34B: Instruction Tuning for Bilingual Dialogue
SUS-Chat-34B is a 34 billion parameter bilingual (Chinese-English) dialogue model, a collaborative effort by the Southern University of Science and Technology and IDEA-CCNL. Built upon the 01-ai/Yi-34B base model, SUS-Chat-34B has undergone extensive instruction fine-tuning using millions of high-quality, multilingual instruction data.
Key Capabilities
- Bilingual Proficiency: Excels in both Chinese and English, with strong performance across mainstream tasks.
- Enhanced Instruction Following: Improved response to human instructions, particularly in imitating human thought processes through chains of thought.
- Extended Context Window: Features an 8K context window, expanded from 4K, significantly enhancing multi-turn dialogue usability through inter-instruction attention sharing.
- Strong General Performance: Outperforms other open-source instruction-tuned models of similar parameter scale in numerous benchmarks, including MMLU, CMMLU, C-Eval, BBH, GSM-8K, and MATH.
- High-Quality Training Data: Trained with 1.4 billion tokens of complex instruction data covering multi-turn dialogues, mathematics, and reasoning.
Good for
- Complex Multilingual Tasks: Designed to meet the practical needs of intricate Chinese and English language tasks.
- Multi-turn Dialogue Systems: Its 8K context window and training on multi-turn data make it highly effective for sustained conversations.
- Academic Research: Demonstrates how effective instruction fine-tuning can achieve strong performance using open-source datasets and models, bridging the gap between academia and industry.
- Reasoning and Mathematical Applications: Shows competitive performance in mathematical and reasoning benchmarks like GSM-8K and MATH.