modelscope/Llama3-Chinese-8B-Instruct-Agent-v1

Text Generation | Concurrency Cost: 1 | Model Size: 8B | Quant: FP8 | Ctx Length: 8k | Published: Apr 23, 2024 | License: llama3 | Architecture: Transformer

modelscope/Llama3-Chinese-8B-Instruct-Agent-v1 is an 8-billion-parameter instruction-tuned language model developed by ModelScope and fine-tuned from Llama3-8B-Instruct. It is adapted for general Chinese language scenarios and supports ReACT-formatted agent calls. The model targets Chinese natural language understanding and agentic reasoning, making it suitable for applications that need interactive AI capabilities in Chinese contexts.


Overview

modelscope/Llama3-Chinese-8B-Instruct-Agent-v1 is an 8 billion parameter instruction-tuned model built upon the Llama3-8B-Instruct base. Developed by ModelScope, this model is specifically enhanced for general Chinese language applications and features support for ReACT-formatted agent interactions.

Key Capabilities

  • Chinese Language Adaptation: Fine-tuned with a diverse mix of Chinese datasets, including COIG-CQIA (covering traditional Chinese knowledge, social media, and Q&A platforms) and the ModelScope general Chinese Q&A dataset (ms-bench).
  • Agentic Reasoning: Designed to support ReACT-style agent calls, enabling more complex, multi-turn interactions and tool use within agent frameworks like ModelScopeAgent.
  • Training Details: The model was trained for 2 epochs with a learning rate of 5e-5, utilizing LoRA (rank 8, alpha 32) for efficient fine-tuning on a mixed dataset including Chinese and English instruction data (alpaca-en).
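The fine-tuning hyperparameters listed above can be collected in a small sketch that also shows what the LoRA rank and alpha imply. The `LoraSettings` class and its method names are hypothetical and are not part of any ModelScope API. The 4096 x 4096 shape is only an illustrative projection size, since the card does not say which modules were adapted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hyperparameters quoted in the model card; the class itself is hypothetical."""

    rank: int = 8
    alpha: int = 32
    learning_rate: float = 5e-5
    epochs: int = 2

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update B @ A by alpha / rank.
        return self.alpha / self.rank

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # A is (rank x d_in) and B is (d_out x rank), so one adapted matrix
        # gains rank * (d_in + d_out) trainable parameters.
        return self.rank * (d_in + d_out)


settings = LoraSettings()
print(settings.scaling)                     # alpha / rank
print(settings.adapter_params(4096, 4096))  # illustrative shape (assumption)
```

With rank 8 and alpha 32 the update is scaled by 4.0, and a single 4096 x 4096 matrix would gain 65,536 trainable parameters, compared with 16,777,216 in the full matrix.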

Performance Considerations

Although the model is optimized for Chinese and agent scenarios, its score on English mathematical reasoning (GSM8K) is lower than that of the base Llama3-8B-Instruct, dropping from 0.7475 to 0.652. The CEVAL score also fell slightly, from 0.5089 to 0.4903. These results point to a trade-off: the fine-tuning prioritizes Chinese instruction following and agent capabilities at some cost to these benchmarks.
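To put the two reported drops on the same scale, the short sketch below computes the absolute and relative change for each benchmark. The scores are the ones quoted above; the `score_change` helper is hypothetical.

```python
# (base Llama3-8B-Instruct score, fine-tuned score) as reported in the card.
scores = {
    "GSM8K": (0.7475, 0.652),
    "CEVAL": (0.5089, 0.4903),
}


def score_change(base: float, tuned: float) -> tuple[float, float]:
    """Return (absolute change, change relative to the base score)."""
    absolute = tuned - base
    relative = absolute / base
    return absolute, relative


for name, (base, tuned) in scores.items():
    absolute, relative = score_change(base, tuned)
    print(f"{name}: {absolute:+.4f} absolute, {relative:+.1%} relative")
```

This prints a change of -0.0955 (-12.8% relative) for GSM8K and -0.0186 (-3.7% relative) for CEVAL, so the GSM8K regression is the larger of the two by either measure.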

Good For

  • Developing AI agents that require strong Chinese language understanding and interactive capabilities.
  • Applications needing ReACT-style agent prompting for complex task execution.
  • General Chinese instruction-following tasks and conversational AI.
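As a rough illustration of what ReACT-style agent prompting involves on the application side, the sketch below extracts a tool call from one model turn. The `Thought` / `Action` / `Action Input` field names follow the common ReAct convention and are an assumption here: the card does not give the exact template this model expects. The `get_weather` tool is hypothetical.

```python
import json
import re

# Assumed ReAct-style layout: an "Action:" line naming the tool, followed by
# an "Action Input:" line holding single-line JSON arguments.
ACTION_RE = re.compile(
    r"Action:\s*(?P<tool>.+?)\s*\nAction Input:\s*(?P<args>\{.*\})"
)


def parse_action(model_output: str):
    """Return (tool_name, arguments) for the first action, or None if absent."""
    match = ACTION_RE.search(model_output)
    if match is None:
        return None
    return match.group("tool"), json.loads(match.group("args"))


sample = (
    "Thought: I need the current weather in Beijing.\n"
    "Action: get_weather\n"
    'Action Input: {"city": "Beijing"}\n'
)
print(parse_action(sample))
```

An agent loop would run the named tool, append the result as an observation, and call the model again until it produces a final answer instead of another action. A framework such as ModelScopeAgent handles this loop itself, so a parser like this is only needed when wiring the model up by hand.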