PengQu/open_llama_7b_v2_vicuna_Chinese
PengQu/open_llama_7b_v2_vicuna_Chinese is a 7 billion parameter chat model fine-tuned from open_llama_7b_v2 for conversational tasks in both English and Chinese. The model supports a 4096-token context length and is commercially usable. It matches Vicuna-7B in English while outperforming it on Chinese and programming tasks, likely because its base model was pre-trained on the StarCoder dataset.
Model Overview
PengQu/open_llama_7b_v2_vicuna_Chinese is a 7 billion parameter conversational model developed by PengQu. It is built upon the commercially permissible open_llama_7b_v2 foundation model and has been fine-tuned using a combination of ShareGPT, ShareGPT-ZH, and Langchain-MRKL-finetune datasets. The training process leveraged the FastChat framework.
Key Capabilities and Differentiators
- Bilingual Performance: Achieves English performance on par with Vicuna-7B while delivering noticeably better performance on Chinese conversational tasks.
- Enhanced Programming Ability: Exhibits stronger programming capabilities than Vicuna-7B, attributed to the likely inclusion of the StarCoder dataset in open_llama_7b_v2's pre-training.
- Commercial Use: The underlying open_llama_7b_v2 model permits commercial applications.
- Langchain-MRKL Support: Supports the Langchain-MRKL format, specifically agent="zero-shot-react-description" (see the sketch after this list).
Usage Notes
When using Hugging Face Transformers, avoid the fast tokenizer: either use LlamaTokenizer directly or pass use_fast=False to AutoTokenizer, otherwise some inputs may be tokenized incorrectly.
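A minimal loading and generation sketch with the recommended slow tokenizer follows. The Vicuna-style prompt template is an assumption based on the FastChat fine-tuning setup, and the generation parameters are illustrative; check FastChat's conversation templates for the exact format.

```python
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

model_path = "PengQu/open_llama_7b_v2_vicuna_Chinese"

# Slow tokenizer, as recommended above; the fast tokenizer may tokenize incorrectly.
tokenizer = LlamaTokenizer.from_pretrained(model_path)
# Equivalent: AutoTokenizer.from_pretrained(model_path, use_fast=False)

model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map="auto"
)

# Assumed Vicuna v1.1-style chat template (not confirmed by the model card).
# The Chinese request means "Write a quicksort function in Python."
prompt = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's "
    "questions. USER: 用Python写一个快速排序函数。 ASSISTANT:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(
    **inputs, max_new_tokens=256, temperature=0.7, do_sample=True
)
# Decode only the newly generated tokens, skipping the echoed prompt.
response = tokenizer.decode(
    output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(response)
```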