Rookie/Llama-3-8B-Instruct-Chinese

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8K · Published: Apr 22, 2024 · Architecture: Transformer

Rookie/Llama-3-8B-Instruct-Chinese is an 8-billion-parameter instruction-tuned causal language model, fine-tuned from Llama-3-8B-Instruct specifically for Chinese-language tasks. Developed by Rookie, this model excels at Chinese multi-turn dialogue, general NLP tasks, and mathematical reasoning. It leverages diverse Chinese datasets, including firefly-train-1.1M, moss-003-sft-data, and school_math_0.25M, to strengthen its understanding and generation capabilities in Chinese contexts.


Rookie/Llama-3-8B-Instruct-Chinese Overview

Rookie/Llama-3-8B-Instruct-Chinese is an 8-billion-parameter instruction-tuned language model: a version of Llama-3-8B-Instruct fine-tuned specifically for Chinese. The model targets improved performance across a range of Chinese NLP tasks and conversational abilities.

Key Capabilities

  • Enhanced Chinese Dialogue: Optimized for natural and coherent multi-turn conversations in Chinese, as demonstrated by its ability to handle complex queries and maintain context.
  • Diverse Task Proficiency: Trained on a rich collection of Chinese datasets, enabling it to perform well in tasks such as poetry generation, classical Chinese translation, and general question-answering.
  • Mathematical Reasoning: Incorporates the school_math_0.25M dataset, providing it with capabilities for mathematical problem-solving.
  • Cultural Nuance: Includes data specifically designed to incorporate elements of Chinese culture, such as couplets and classical literature.
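The multi-turn dialogue capability above can be sketched with standard Hugging Face `transformers` usage. This is a minimal example, not code from the model card: it assumes the model follows the usual Llama-3 chat template (the `build_messages` helper and the sampling defaults are illustrative choices).

```python
MODEL_ID = "Rookie/Llama-3-8B-Instruct-Chinese"

def build_messages(history, user_turn):
    """Append the latest user turn to a running multi-turn chat history."""
    return history + [{"role": "user", "content": user_turn}]

def chat(messages, max_new_tokens=256):
    # transformers is imported lazily so the helper above stays usable
    # without the (large) model dependencies installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # An 8B model needs substantial GPU memory; device_map="auto" lets
    # accelerate place the weights across available devices.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # Render the multi-turn history with the model's chat template.
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)

if __name__ == "__main__":
    history = [{"role": "system", "content": "你是一个乐于助人的中文助手。"}]  # "You are a helpful Chinese assistant."
    messages = build_messages(history, "请写一首关于春天的五言绝句。")  # "Write a five-character quatrain about spring."
    print(chat(messages))
```

Each assistant reply can be appended to `history` with `{"role": "assistant", ...}` before the next call, which is how the model maintains context across turns.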

Training Details

The model was fine-tuned using a curated set of Chinese datasets:

  • firefly-train-1.1M: Contains 1.15 million entries covering 23 common Chinese NLP tasks, with a focus on high-quality, human-written instruction templates and cultural content.
  • moss-003-sft-data: A large-scale Chinese and English multi-turn dialogue dataset with over 1 million entries.
  • school_math_0.25M: Comprises 250,000 mathematical operation instructions.
  • ruozhiba: A dataset of deliberately absurd or trick questions from the Ruozhiba ("weak intelligence bar") forum, aimed at improving the model's cognitive robustness.

Good For

  • Applications requiring robust Chinese conversational AI.
  • Developers building tools for Chinese NLP tasks, including content generation and translation.
  • Educational platforms needing assistance with Chinese mathematical problems.
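For the application scenarios above, the model can also be reached through an OpenAI-compatible chat-completions API such as the one Featherless serves. The sketch below only builds the request; the base URL, auth scheme, and field names are assumptions based on the common OpenAI-compatible convention, so check the provider's docs before use.

```python
import json
import urllib.request

def build_request(api_key, user_prompt,
                  base_url="https://api.featherless.ai/v1"):
    """Build an OpenAI-style chat-completions request for this model."""
    payload = {
        "model": "Rookie/Llama-3-8B-Instruct-Chinese",
        "messages": [{"role": "user", "content": user_prompt}],
        "max_tokens": 256,
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

if __name__ == "__main__":
    req = build_request("YOUR_API_KEY", "解释一下什么是指令微调。")  # "Explain what instruction tuning is."
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

Using the stdlib `urllib` keeps the example dependency-free; any OpenAI-compatible client library would work the same way by pointing its base URL at the provider.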

Popular Sampler Settings

Featherless tracks the top 3 sampler-parameter combinations its users apply to this model. The configurable parameters are:

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
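The seven parameters above can be collected into a sampling config like the one below. The values are illustrative placeholders, not the actual Featherless user statistics (which are only available in the interactive widget); note that `frequency_penalty`/`presence_penalty` follow the OpenAI API convention, while `repetition_penalty` and `min_p` are common in local inference stacks.

```python
# Illustrative sampler configuration -- placeholder values, not measured data.
sampler_config = {
    "temperature": 0.7,         # < 1.0 sharpens the token distribution, > 1.0 flattens it
    "top_p": 0.9,               # nucleus sampling: smallest token set with cumulative prob >= 0.9
    "top_k": 50,                # sample only from the 50 most likely tokens
    "frequency_penalty": 0.0,   # OpenAI-style: penalize tokens by how often they already appeared
    "presence_penalty": 0.0,    # OpenAI-style: penalize tokens that appeared at all
    "repetition_penalty": 1.1,  # multiplicative penalty on repeated tokens (HF-style)
    "min_p": 0.05,              # drop tokens below 5% of the top token's probability
}
```

With an OpenAI-compatible API these keys go directly into the request body; with `transformers`, the overlapping ones map onto keyword arguments of `model.generate()`.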