Name: Qwen/Qwen-7B-Chat API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Qwen

Overview

Qwen-7B-Chat is a 7-billion parameter, Transformer-based large language model developed by Alibaba Cloud, part of the Qwen (Tongyi Qianwen) series. It is pretrained on a vast and diverse dataset, encompassing web texts, professional books, and code. This model is specifically the chat-optimized version, fine-tuned using alignment techniques to function as an AI assistant.

Key Capabilities

Multilingual Proficiency: Demonstrates strong performance in both Chinese (C-Eval) and English (MMLU) understanding, outperforming several comparable models.
Code Generation: Achieves a Pass@1 score of 37.2 on HumanEval, indicating solid code generation capabilities.
Mathematical Reasoning: Scores 50.3 on GSM8K (0-shot), showcasing its ability in mathematical problem-solving.
Long Context Understanding: Supports a context length of 32768 tokens, with strong performance on long-text summarization tasks like VCSUM (Rouge-L 16.6).
Tool Usage: Excels in tool calling via ReAct Prompting, achieving 98% accuracy in tool selection and 0.91 Rouge-L for tool input on a Chinese tool-use benchmark. It also performs well as a Code Interpreter and HuggingFace Agent.
Quantization Support: Offers Int4 and Int8 quantized models with minimal performance degradation, significantly reducing memory usage and improving inference speed, especially with Flash Attention 2.

Good for

Developing AI assistants requiring strong conversational abilities in both Chinese and English.
Applications needing robust code generation and mathematical reasoning.
Scenarios demanding long-context processing and summarization.
Integrating with external tools and APIs through ReAct-style prompting or as a HuggingFace Agent.
Deployment in resource-constrained environments, leveraging its efficient quantization options.

Overview

Overview

Key Capabilities

Good for

Full Model Card (README)