Name: tangger/Qwen-7B-Chat API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: tangger

tangger/Qwen-7B-Chat: An Instruction-Tuned AI Assistant

tangger/Qwen-7B-Chat is a 7-billion-parameter, instruction-tuned model from Alibaba Cloud's Qwen (Tongyi Qianwen) series of large language models. It uses a Transformer architecture and was pretrained on a large, diverse corpus of web text, books, and code. The tangger repository is a re-upload of the Qwen-7B-Chat model from September 11, provided as a temporary backup.

Key Capabilities & Features

Strong Multilingual Performance: Competitive with models of similar scale on both Chinese (C-Eval: 54.2% average accuracy) and English (MMLU: 53.9% average accuracy) understanding benchmarks.
Coding Proficiency: Scores 24.4% Pass@1 on HumanEval.
Mathematical Reasoning: Reaches 41.1% zero-shot accuracy on GSM8K.
Extended Context Length: Supports a context length of 32768 tokens, using NTK-aware interpolation and LogN attention scaling for long-context understanding (e.g., 16.6 Rouge-L on VCSUM).
Advanced Tool Usage: Supports ReAct prompting with 99% tool selection accuracy and a low false-positive rate, and can also serve as a HuggingFace Agent.
Efficient Quantization: An Int4 quantized version (Qwen/Qwen-7B-Chat-Int4) is available, with nearly lossless performance, faster inference, and lower memory usage (e.g., 45.60 tokens/s vs. 30.53 tokens/s for BF16 when generating 2048 tokens).
Optimized Tokenization: Uses a tiktoken-based vocabulary of more than 150K tokens that encodes Chinese, English, and code efficiently and remains friendly to other languages.
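
The two long-context techniques named above can be illustrated with a few lines of arithmetic. The following is a minimal sketch of the general ideas, not the model's actual implementation: NTK-aware interpolation is shown as enlarging the rotary embedding base, and LogN attention scaling as a per-position factor applied to queries beyond the training length. The function names and the values used (a rotary base of 10000, a head dimension of 128, a training length of 2048) are assumptions chosen for illustration.

```python
import math


def ntk_scaled_base(base: float, head_dim: int, alpha: float) -> float:
    """NTK-aware interpolation (sketch): enlarge the rotary base by
    alpha ** (d / (d - 2)) instead of compressing position indices."""
    return base * alpha ** (head_dim / (head_dim - 2))


def rotary_inv_freq(base: float, head_dim: int) -> list[float]:
    """Inverse rotary frequencies, one per pair of head dimensions."""
    return [1.0 / base ** (i / head_dim) for i in range(0, head_dim, 2)]


def logn_scale(position: int, train_len: int) -> float:
    """LogN attention scaling (sketch): queries at 1-based positions past
    the training length are multiplied by log_{train_len}(position)."""
    return math.log(position, train_len) if position > train_len else 1.0


if __name__ == "__main__":
    # Assumed illustrative values, not taken from the model's configuration.
    base, head_dim, train_len, alpha = 10000.0, 128, 2048, 4.0

    original = rotary_inv_freq(base, head_dim)
    scaled = rotary_inv_freq(ntk_scaled_base(base, head_dim, alpha), head_dim)

    # The highest frequency is untouched; the lowest is divided by alpha.
    print("highest frequency:", original[0], "->", scaled[0])
    print("lowest frequency: ", original[-1], "->", scaled[-1])
    print("query scale at position 8192:", logn_scale(8192, train_len))
```

With this choice of exponent, the fastest-rotating dimension keeps its frequency while the slowest one is stretched by exactly the factor alpha, which is why the method extends the usable range without blurring nearby positions.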

Good for

Developing AI assistants requiring strong conversational and reasoning abilities.
Applications needing robust performance in both Chinese and English language tasks.
Code generation and mathematical problem-solving.
Scenarios demanding efficient processing of long text contexts.
Integrating with external tools and APIs via ReAct prompting or as a HuggingFace Agent, especially where high tool selection accuracy is critical.
Deployment in environments with memory constraints, using the Int4 quantized version.
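
As a rough illustration of the ReAct-style tool integration mentioned above, the sketch below builds a prompt that lists the available tools and then parses the tool call out of a model reply. It uses the generic ReAct layout (Thought / Action / Action Input / Observation); the exact template the model expects should be taken from its model card. The `get_weather` tool, the helper names, and the sample reply are all hypothetical, and no model or API is called.

```python
import json
import re

# Generic ReAct layout; an assumption, not necessarily the model's exact template.
REACT_TEMPLATE = """Answer the following questions as best you can. You have access to the following tools:

{tool_descs}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {query}"""


def build_react_prompt(tools: list[dict], query: str) -> str:
    """Render the tool list and the user question into one prompt string."""
    tool_descs = "\n".join(
        f"{t['name']}: {t['description']} Parameters: {json.dumps(t['parameters'])}"
        for t in tools
    )
    tool_names = ", ".join(t["name"] for t in tools)
    return REACT_TEMPLATE.format(
        tool_descs=tool_descs, tool_names=tool_names, query=query
    )


def parse_action(reply: str):
    """Return (tool_name, raw_input) if the reply requests a tool, else None."""
    match = re.search(
        r"Action:\s*(.+?)\s*\nAction Input:\s*(.+?)\s*(?:\nObservation:|$)",
        reply,
        re.DOTALL,
    )
    if match is None:
        return None
    return match.group(1), match.group(2)


if __name__ == "__main__":
    # Hypothetical tool definition used only for this example.
    tools = [{
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {"city": "string"},
    }]
    print(build_react_prompt(tools, "What is the weather in Hangzhou?"))

    # Hand-written stand-in for a model reply.
    reply = (
        "Thought: I should look up the weather.\n"
        "Action: get_weather\n"
        'Action Input: {"city": "Hangzhou"}\n'
        "Observation:"
    )
    name, raw_input = parse_action(reply)
    print(name, json.loads(raw_input))
```

In a real loop, the caller would run the selected tool, append its result after "Observation:", and send the extended text back to the model until a "Final Answer:" line appears.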
