shenzhi-wang/Llama3.1-8B-Chinese-Chat

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32K · Published: Jul 24, 2024 · License: llama3.1 · Architecture: Transformer

shenzhi-wang/Llama3.1-8B-Chinese-Chat is an 8 billion parameter instruction-tuned language model developed by Shenzhi Wang and Yaowei Zheng, built upon Meta-Llama-3.1-8B-Instruct. It is fine-tuned specifically for Chinese and English users, with enhanced roleplay, function-calling, and mathematical capabilities. The model supports a context length of up to 128K tokens (served here with a 32K window) and is suited to a wide range of conversational applications.


Model Overview

shenzhi-wang/Llama3.1-8B-Chinese-Chat is an 8 billion parameter instruction-tuned language model, developed by Shenzhi Wang and Yaowei Zheng, designed for both Chinese and English users. It is built upon the Meta-Llama-3.1-8B-Instruct base model and fine-tuned with the ORPO algorithm on a dataset of over 100,000 preference pairs, yielding significant gains in roleplay, function calling, and mathematics over the base model.
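For quick experimentation, the model can be loaded with the standard Hugging Face Transformers chat workflow. The sketch below is minimal and illustrative; the dtype, device placement, and generation length are assumptions, not recommendations from the model card.

```python
# Minimal sketch: chat inference with Transformers.
# dtype, device_map, and max_new_tokens are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "shenzhi-wang/Llama3.1-8B-Chinese-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "你好，请用中文介绍一下你自己。"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```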

Key Capabilities

  • Enhanced Roleplay: Demonstrates improved performance in role-playing scenarios.
  • Function Calling: Exhibits strong capabilities in function calling tasks (see the hedged sketch after this list).
  • Mathematical Reasoning: Shows significant improvements in handling mathematical problems.
  • Multilingual Support: Optimized for both Chinese and English language users.
  • Extended Context: Inherits the 128K token context length from its base model, though this capability has not been specifically tested on the Chinese chat fine-tune.
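To illustrate the function-calling capability, the sketch below builds a tool-use prompt via the `tools` argument of Transformers' `apply_chat_template` (available since v4.42). It assumes this fine-tune preserves the base Llama 3.1 tool-call format, which the model card does not state explicitly; `get_weather` is a hypothetical tool.

```python
# Hedged sketch: constructing a function-calling prompt with the chat
# template's `tools` support. get_weather is a hypothetical example tool;
# whether this fine-tune keeps the base model's tool format is an assumption.
import json
from transformers import AutoTokenizer

def get_weather(city: str) -> str:
    """Get the current weather for a city.

    Args:
        city: Name of the city to query.
    """
    return json.dumps({"city": city, "temp_c": 22})

tokenizer = AutoTokenizer.from_pretrained("shenzhi-wang/Llama3.1-8B-Chinese-Chat")
messages = [{"role": "user", "content": "What's the weather in Shenzhen right now?"}]
prompt = tokenizer.apply_chat_template(
    messages, tools=[get_weather], add_generation_prompt=True, tokenize=False
)
print(prompt)  # Inspect how the tool schema is injected before generating
```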

Training Details

The model was fine-tuned using the LLaMA-Factory framework over 3 epochs with full parameter tuning. It employs a cosine learning rate scheduler with a warmup ratio of 0.1 and an ORPO beta of 0.05. The cutoff length for training was 8192 tokens, with a global batch size of 128.
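For reference, the reported hyperparameters can be gathered into a single configuration sketch. The key names below loosely mirror LLaMA-Factory options but are assumptions; only the values come from the training description above.

```python
# Reported ORPO fine-tuning setup. Key names are assumptions loosely
# mirroring LLaMA-Factory options; values are from the model card.
orpo_config = {
    "finetuning_type": "full",      # full parameter tuning
    "stage": "orpo",                # ORPO preference optimization
    "orpo_beta": 0.05,
    "num_train_epochs": 3,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.1,
    "cutoff_len": 8192,             # training cutoff length in tokens
    "global_batch_size": 128,       # per-device batch x grad accumulation x GPUs
}
```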

Good for

This model is particularly well-suited for applications requiring advanced conversational abilities, especially those involving role-playing, precise function execution, and mathematical problem-solving in both Chinese and English contexts. Its availability in official q4_k_m, q8_0, and f16 GGUF versions also makes it suitable for local deployment and inference using tools like LM Studio or llama.cpp.
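For local inference against one of the GGUF builds, llama-cpp-python offers a compact path. The repo id and filename glob below are assumptions; check the model page for the official GGUF file locations.

```python
# Hedged sketch: local inference with llama-cpp-python on a q4_k_m GGUF.
# The filename glob is an assumption; verify it against the actual repo.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="shenzhi-wang/Llama3.1-8B-Chinese-Chat",
    filename="*q4_k_m*.gguf",  # glob matched against files in the repo
    n_ctx=8192,                # context window for this session
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "用中文写一首关于秋天的短诗。"}]
)
print(out["choices"][0]["message"]["content"])
```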

Popular Sampler Settings

Featherless tracks the top three sampler configurations its users run with this model, across the following parameters:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
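These parameters map directly onto an OpenAI-compatible chat completion request. The sketch below shows where each one goes; the base URL is an assumed Featherless endpoint, and the values are placeholders rather than the actual top user configurations.

```python
# Hedged sketch: passing these sampler settings to an OpenAI-compatible
# endpoint. The base URL is assumed and the values are placeholders, not
# the actual top user configurations.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.featherless.ai/v1",  # assumed endpoint
    api_key="YOUR_API_KEY",
)
resp = client.chat.completions.create(
    model="shenzhi-wang/Llama3.1-8B-Chinese-Chat",
    messages=[{"role": "user", "content": "你好！"}],
    temperature=0.7,           # placeholder values throughout
    top_p=0.9,
    frequency_penalty=0.0,
    presence_penalty=0.0,
    # Non-standard sampler knobs go through extra_body on this client.
    extra_body={"top_k": 40, "repetition_penalty": 1.05, "min_p": 0.05},
)
print(resp.choices[0].message.content)
```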