openlmlab/open-chinese-llama-7b-patch

Text generation · Model size: 7B · Quantization: FP8 · Context length: 4K · Published: Apr 24, 2023 · License: apache-2.0 · Architecture: Transformer · Open weights

Open-Chinese-LLaMA-7B-Patch by OpenLMLab is a 7 billion parameter LLaMA-based model incrementally pre-trained on Chinese datasets, designed to significantly enhance Chinese language understanding and generation. Distributed as a weight patch, it must be applied to the original LLaMA-7B base weights before use. The patched model excels at a range of Chinese downstream tasks and outperforms the original LLaMA on both Chinese and English benchmarks, including code generation.


Open-Chinese-LLaMA-7B-Patch Overview

Open-Chinese-LLaMA-7B-Patch is a 7 billion parameter large language model developed by OpenLMLab. It is built upon the LLaMA-7B architecture and has undergone incremental pre-training using extensive Chinese datasets. This process significantly boosts its proficiency in Chinese language understanding and generation compared to the original LLaMA model.

Key Capabilities & Features

  • Enhanced Chinese Performance: Demonstrates substantial improvements in various Chinese downstream tasks, as evidenced by evaluation results on datasets like OCNLI, CHID, TNEWS, and CMRC.
  • Patch-Based Deployment: Released as a weight patch that must be applied to an existing official LLaMA-7B model to comply with LLaMA's licensing; the repository provides tools for this patching step (an illustrative sketch follows this list).
  • Hugging Face Compatibility: The patched model is fully compatible with the Hugging Face transformers library, allowing for easy integration and use (a minimal loading example appears at the end of this overview).
  • Multilingual Improvement: While primarily focused on Chinese, it also shows improved or comparable performance on some English tasks, such as HumanEval for code generation.
  • Code Generation: Examples in the README illustrate its ability to generate code.
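
The repository's own tooling should be used for the actual patching step. The snippet below is only a minimal sketch of how an additive weight patch of this kind can be applied with plain PyTorch and transformers; the file name `patch.bin` and the assumption that the patch is a state dict of per-parameter deltas keyed like the base model are hypothetical, not the project's documented format.

```python
# Illustrative sketch only: apply an additive weight patch to LLaMA-7B.
# Paths and the patch file layout are assumptions for this example.
import torch
from transformers import LlamaForCausalLM

# Load the original LLaMA-7B base weights (user-provided).
base = LlamaForCausalLM.from_pretrained(
    "path/to/original-llama-7b", torch_dtype=torch.float16
)

# Hypothetical patch format: a state dict of additive deltas.
deltas = torch.load("path/to/open-chinese-llama-7b-patch/patch.bin", map_location="cpu")

# Add each delta to the corresponding base parameter.
state = base.state_dict()
for name, delta in deltas.items():
    state[name] = state[name] + delta.to(state[name].dtype)
base.load_state_dict(state)

# Save the patched model in Hugging Face format for downstream use.
base.save_pretrained("open-chinese-llama-7b")
```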

When to Use This Model

This model is well suited to applications that require strong Chinese language processing, including text generation, comprehension, and other downstream NLP tasks. Because it is distributed as a patch, users must have access to the original LLaMA-7B weights. It offers a solid foundation for building Chinese-centric LLM applications.
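
Once the patch has been applied and the resulting weights saved in Hugging Face format, the model loads and generates like any other transformers causal LM. A minimal sketch, assuming the patched weights were saved to a local directory (the path, prompt, and sampling settings are placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "open-chinese-llama-7b"  # directory produced by the patching step
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, torch_dtype="auto", device_map="auto"
)

prompt = "请用一句话介绍一下人工智能。"  # "Introduce AI in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```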