shenzhi-wang/Llama3-70B-Chinese-Chat

Hugging Face
Text Generation · Concurrency Cost: 4 · Model Size: 70B · Quantization: FP8 · Context Length: 8K · Published: May 7, 2024 · License: llama3 · Architecture: Transformer

shenzhi-wang/Llama3-70B-Chinese-Chat is a 70.6-billion-parameter instruction-tuned language model developed by Shenzhi Wang, Yaowei Zheng, Guoyin Wang, Shiji Song, and Gao Huang. Built upon Meta-Llama-3-70B-Instruct, it is fine-tuned on a mixed Chinese-English dataset of over 100K preference pairs and excels at Chinese-language tasks, roleplaying, tool use, and mathematics. The model significantly reduces Chinese-English code-mixing issues and offers a context length of 8192 tokens, making it suitable for complex multilingual applications.


Overview

shenzhi-wang/Llama3-70B-Chinese-Chat is a 70.6 billion parameter instruction-tuned language model, developed by Shenzhi Wang and collaborators, based on Meta-Llama-3-70B-Instruct. It is one of the first LLMs specifically fine-tuned for Chinese and English users, addressing issues like "Chinese questions with English answers" and mixed language responses prevalent in other models. The model was trained using the ORPO algorithm on a dataset of over 100K mixed Chinese-English preference pairs, with a context length of 8192 tokens.
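Because the model inherits the standard Llama 3 chat template from Meta-Llama-3-70B-Instruct, prompts follow the Llama 3 special-token layout. The sketch below builds that layout by hand for illustration (in practice the tokenizer's `apply_chat_template` produces the same string); the helper name and example message are our own:

```python
# Sketch of the Llama 3 chat prompt layout this model inherits from
# Meta-Llama-3-70B-Instruct. Special tokens follow the published Llama 3
# format; `build_llama3_prompt` is a hypothetical helper for illustration.

def build_llama3_prompt(messages):
    """Render a list of {"role": ..., "content": ...} dicts into Llama 3 chat markup."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    # End with an open assistant header so the model generates the reply next.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama3_prompt([
    {"role": "user", "content": "写一首关于春天的诗"},  # "Write a poem about spring"
])
```

Feeding a prompt shaped like this (or the tokenizer's templated equivalent) is what lets the model answer Chinese questions in Chinese rather than mixing languages.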

Key Capabilities

  • Superior Chinese Performance: Benchmarks on C-Eval and CMMLU show its Chinese performance significantly surpasses ChatGPT and is comparable to GPT-4.
  • Multilingual Proficiency: Greatly reduces Chinese-English mixing issues, providing more coherent responses in both languages.
  • Diverse Abilities: Excels in roleplaying, function calling (tool use), and mathematical problem-solving.
  • Full-Parameter Fine-tuning: Utilizes full-parameter fine-tuning for enhanced performance over its base model.

Good For

  • Applications requiring high-quality Chinese language generation and understanding.
  • Use cases involving complex roleplaying scenarios.
  • Integrating with tools via function calling.
  • Solving mathematical problems and logical reasoning tasks.
  • Developers seeking a powerful, multilingual LLM with strong performance in both Chinese and English.

Popular Sampler Settings

The three most common parameter combinations used by Featherless users for this model draw on the following sampler settings:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
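These settings are typically passed in the request body of an OpenAI-compatible chat-completions call, which Featherless exposes. The sketch below assembles such a payload; the specific values are illustrative placeholders, not the user statistics referenced above:

```python
import json

# Illustrative sampler configuration for an OpenAI-compatible
# /v1/chat/completions request. The numeric values are placeholder
# assumptions, not the popular Featherless configs.
payload = {
    "model": "shenzhi-wang/Llama3-70B-Chinese-Chat",
    "messages": [
        {"role": "user", "content": "用中文介绍一下你自己"}  # "Introduce yourself in Chinese"
    ],
    "temperature": 0.7,
    "top_p": 0.9,
    "repetition_penalty": 1.05,
    "min_p": 0.05,
    "max_tokens": 512,
}
body = json.dumps(payload)
# POST `body` to the provider's chat-completions endpoint with an
# "Authorization: Bearer <API_KEY>" header to run a completion.
```

Lower temperatures favor factual and mathematical tasks; higher temperatures with a modest repetition penalty suit roleplay.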