Overview
modelscope/Llama3-Chinese-8B-Instruct-Agent-v1 is an 8-billion-parameter instruction-tuned model built on the Llama3-8B-Instruct base. Developed by ModelScope, it is enhanced for general Chinese-language applications and supports ReACT-formatted agent interactions.
Key Capabilities
- Chinese Language Adaptation: Fine-tuned with a diverse mix of Chinese datasets, including COIG-CQIA (covering traditional Chinese knowledge, social media, and Q&A platforms) and the ModelScope general Chinese Q&A dataset (ms-bench).
- Agentic Reasoning: Designed to support ReACT-style agent calls, enabling more complex, multi-turn interactions and tool use within agent frameworks like ModelScopeAgent.
- Training Details: The model was trained for 2 epochs with a learning rate of 5e-5, utilizing LoRA (rank 8, alpha 32) for efficient fine-tuning on a mixed dataset including Chinese and English instruction data (alpaca-en).
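The LoRA hyperparameters above (rank 8, alpha 32) imply a scaling factor of alpha/r = 4 on the low-rank update. A minimal sketch of that update in plain Python, assuming the standard LoRA formulation delta_W = (alpha/r) * B @ A (the actual training used a fine-tuning framework, not hand-rolled matrices):

```python
def lora_delta(A, B, alpha=32, r=8):
    """Low-rank weight update scaled by alpha/r: delta_W = (alpha/r) * B @ A.

    A is an r x d_in matrix, B is a d_out x r matrix, both as nested lists.
    With the model card's settings (alpha=32, r=8), the scale factor is 4.0.
    """
    scale = alpha / r
    inner = len(A)        # shared rank dimension
    cols = len(A[0])
    return [
        [scale * sum(B[i][k] * A[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(len(B))
    ]
```

At inference time this delta is added to the frozen base weight, which is why LoRA fine-tuning only needs to store the small A and B matrices per adapted layer.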
Performance Considerations
While optimized for Chinese and agent scenarios, the model's English mathematical reasoning (GSM8K) drops relative to the base Llama3-8B-Instruct, from 0.7475 to 0.652, and its CEVAL score slips slightly from 0.5089 to 0.4903. This reflects a deliberate trade-off: specialized Chinese and agent capabilities are prioritized over general-purpose benchmark performance.
Good For
- Developing AI agents that require strong Chinese language understanding and interactive capabilities.
- Applications needing ReACT-style agent prompting for complex task execution.
- General Chinese instruction-following tasks and conversational AI.
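For the ReACT-style agent use cases above, an agent framework must parse the model's structured output into a tool call. A minimal sketch, assuming the common ReACT field labels (Thought / Action / Action Input); the model's actual template may use different delimiters, so treat the label names here as illustrative:

```python
import re

def parse_react_step(text: str) -> dict:
    """Extract the Thought, Action, and Action Input fields from one
    ReACT-style turn of model output. Missing fields come back as None.

    Note: the field labels are the conventional ReACT ones, assumed here
    for illustration rather than taken from the model's chat template.
    """
    patterns = {
        "thought": r"Thought:\s*(.*)",
        "action": r"Action:\s*(.*)",
        "action_input": r"Action Input:\s*(.*)",
    }
    fields = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        fields[key] = match.group(1).strip() if match else None
    return fields
```

An agent loop would call `parse_react_step` on each generation, dispatch `action` with `action_input` to the matching tool, append the tool's observation to the conversation, and generate again until the model emits a final answer instead of an action.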