Qwen/Qwen-7B-Chat

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 32k | Published: Aug 3, 2023 | License: tongyi-qianwen-license-agreement | Architecture: Transformer

Qwen/Qwen-7B-Chat is a 7-billion-parameter, Transformer-based large language model developed by Alibaba Cloud and pretrained on a diverse dataset including web texts, books, and code. This chat-optimized version is fine-tuned with alignment techniques and performs strongly in Chinese and English understanding, coding, and mathematical reasoning. It supports a 32k context length and is well suited to tool usage and long-context understanding.


Overview

Qwen-7B-Chat is a 7-billion-parameter, Transformer-based large language model developed by Alibaba Cloud as part of the Qwen (Tongyi Qianwen) series. It is pretrained on a large, diverse dataset that includes web texts, professional books, and code. This is the chat-optimized version, fine-tuned with alignment techniques to function as an AI assistant.
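
As a rough illustration of using the chat-optimized model as an assistant, the sketch below loads it with Hugging Face transformers. It assumes the transformers package is installed, that suitable hardware is available, and that the checkpoint is loaded with trust_remote_code=True so that its chat helper is exposed, as described in the upstream model card. The helper name chat_once is hypothetical and only for illustration.

```python
MODEL_ID = "Qwen/Qwen-7B-Chat"


def chat_once(prompt, history=None):
    """Send one user turn and return (response, updated_history).

    Hypothetical helper: it reloads the model on every call, which is
    fine for a demo but wasteful in a real service.
    """
    # Imported lazily so that defining this function does not itself
    # require transformers to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, device_map="auto", trust_remote_code=True
    ).eval()
    # model.chat is provided by the checkpoint's remote code (assumption
    # based on the upstream model card), not by transformers itself.
    return model.chat(tokenizer, prompt, history=history)


# Example call (downloads the weights on first use):
# response, history = chat_once("Hello, who are you?")
```

In practice the tokenizer and model would be loaded once and reused across turns, passing the returned history back in to keep the conversation state.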

Key Capabilities

  • Multilingual Proficiency: Demonstrates strong performance in both Chinese (C-Eval) and English (MMLU) understanding, outperforming several comparable models.
  • Code Generation: Achieves a Pass@1 score of 37.2 on HumanEval, indicating solid code generation capabilities.
  • Mathematical Reasoning: Scores 50.3 on GSM8K (0-shot), showcasing its ability in mathematical problem-solving.
  • Long Context Understanding: Supports a context length of 32768 tokens, with strong performance on long-text summarization tasks like VCSUM (Rouge-L 16.6).
  • Tool Usage: Excels in tool calling via ReAct Prompting, achieving 98% accuracy in tool selection and 0.91 Rouge-L for tool input on a Chinese tool-use benchmark. It also performs well as a Code Interpreter and HuggingFace Agent.
  • Quantization Support: Offers Int4 and Int8 quantized models with minimal performance degradation. Quantization significantly reduces memory usage and improves inference speed, especially when combined with Flash Attention 2.
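
To make the ReAct-style tool calling mentioned above more concrete, here is a minimal sketch of how such a prompt might be assembled and how a tool call might be parsed from the model's reply. The template wording, the get_weather tool, and the function names are all assumptions for illustration; they are not taken from this page, and the exact prompt format the model was evaluated with may differ.

```python
import re
from typing import Dict, List, Optional, Tuple

# Generic ReAct-style template (an assumption, not the official prompt).
REACT_TEMPLATE = """Answer the following question as best you can. You have access to the following tools:

{tool_descriptions}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {question}"""

# Matches the first "Action: ... / Action Input: ..." pair in a reply.
ACTION_RE = re.compile(
    r"Action:\s*(.+?)\s*\nAction Input:\s*(.+?)\s*(?:\nObservation:|\Z)",
    re.DOTALL,
)


def build_react_prompt(tools: List[Dict[str, str]], question: str) -> str:
    """Fill the template with tool descriptions and the user question."""
    descriptions = "\n".join(f"{t['name']}: {t['description']}" for t in tools)
    names = ", ".join(t["name"] for t in tools)
    return REACT_TEMPLATE.format(
        tool_descriptions=descriptions, tool_names=names, question=question
    )


def parse_action(reply: str) -> Optional[Tuple[str, str]]:
    """Return (tool_name, tool_input), or None if no tool call is present."""
    match = ACTION_RE.search(reply)
    if match is None:
        return None
    return match.group(1), match.group(2)


# Hypothetical tool used only for this example.
TOOLS = [{"name": "get_weather", "description": "Look up the weather for a city."}]
```

A caller would send the built prompt to the model, stop generation at "Observation:", run the selected tool with the parsed input, append the result as the observation, and repeat until a "Final Answer:" line appears.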

Good for

  • Developing AI assistants requiring strong conversational abilities in both Chinese and English.
  • Applications needing robust code generation and mathematical reasoning.
  • Scenarios demanding long-context processing and summarization.
  • Integrating with external tools and APIs through ReAct-style prompting or as a HuggingFace Agent.
  • Deployment in resource-constrained environments, leveraging its efficient quantization options.
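
For a rough sense of why quantization matters in resource-constrained deployments, the arithmetic below estimates the memory needed for the model weights alone at different precisions. It assumes a nominal 7 billion parameters and counts only the weights, ignoring the KV cache, activations, and any quantization metadata, so real usage will be higher. The function name is hypothetical.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NOMINAL_PARAMS = 7e9  # nominal "7B" figure; the exact count may differ

# Weights-only estimates for common precisions.
ESTIMATES = {
    "16-bit (FP16/BF16)": weight_memory_gb(NOMINAL_PARAMS, 16),
    "8-bit (Int8/FP8)": weight_memory_gb(NOMINAL_PARAMS, 8),
    "4-bit (Int4)": weight_memory_gb(NOMINAL_PARAMS, 4),
}

for label, gigabytes in ESTIMATES.items():
    print(f"{label}: ~{gigabytes:.1f} GB")
```

Under these assumptions, halving the bits per parameter halves the weight footprint, which is the main reason the Int8 and Int4 variants fit on smaller GPUs.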