f0rc3ps/Qwen2-7B-Instruct

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 7.6B · Quant: FP8 · Ctx Length: 32k (native) · Published: Apr 5, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

Qwen2-7B-Instruct is a 7.6 billion parameter instruction-tuned causal language model developed by Qwen, based on the Transformer architecture. It features SwiGLU activation, attention QKV bias, and group query attention, supporting a context length of up to 131,072 tokens through YARN. This model demonstrates strong performance across various benchmarks, including language understanding, generation, multilingual capabilities, coding, mathematics, and reasoning, making it suitable for diverse general-purpose AI applications.


Overview

Qwen2-7B-Instruct is a 7.6 billion parameter instruction-tuned model from the Qwen2 series, developed by Qwen. It is built on the Transformer architecture, incorporating features like SwiGLU activation, attention QKV bias, and group query attention. The model has been pretrained on a large dataset and further refined with supervised finetuning and direct preference optimization.
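Group query attention reduces KV-cache size by letting several query heads share each key/value head. A minimal NumPy sketch of the idea (shapes and head counts are illustrative, not the model's actual dimensions; the causal mask is omitted for brevity):

```python
import numpy as np

def gqa_attention(q, k, v):
    """Grouped-query attention: q has more heads than k/v, and each
    group of query heads attends over one shared K/V head.
    q: (n_q_heads, seq, d); k, v: (n_kv_heads, seq, d)."""
    n_q_heads, seq, d = q.shape
    n_kv_heads = k.shape[0]
    group = n_q_heads // n_kv_heads              # query heads per KV head
    k = np.repeat(k, group, axis=0)              # share each KV head across its group
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v                           # (n_q_heads, seq, d)
```

With, say, 4 query heads over 2 KV heads, the cache stores only half the K/V tensors of standard multi-head attention while keeping full query-head resolution.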

Key Capabilities & Features

  • Extended Context Window: Supports processing up to 131,072 tokens using YARN (Yet Another RoPE Extension) for long text handling.
  • Multilingual Support: Utilizes an improved tokenizer adaptive to multiple natural languages and code.
  • Strong Benchmark Performance: Outperforms many similar-sized open-source models, including Qwen1.5-7B-Chat, across various benchmarks.
    • Coding: Achieves 79.9 on HumanEval, 67.2 on MBPP, and 70.3 on Evalplus.
    • Mathematics: Scores 82.3 on GSM8K and 49.6 on MATH.
    • English & Chinese: Demonstrates competitive results on MMLU (70.5), MMLU-Pro (44.1), C-Eval (77.2), and AlignBench (7.21).
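The long-context figures line up: the native 32,768-token window scaled by a YARN factor of 4.0 yields 131,072 tokens. A sketch of enabling this via a `rope_scaling` entry in the model's config, following the convention used by Hugging Face transformers (key names and values are assumptions to verify against the upstream model card):

```python
import json

def enable_yarn(config: dict) -> dict:
    """Add a YARN rope-scaling entry so the native 32k context
    extends to 4.0 x 32768 = 131072 tokens (hypothetical keys)."""
    config = dict(config)
    config["rope_scaling"] = {
        "factor": 4.0,
        "original_max_position_embeddings": 32768,
        "type": "yarn",
    }
    return config

cfg = enable_yarn({"max_position_embeddings": 32768})
print(json.dumps(cfg["rope_scaling"], indent=2))
```

Note that static rope scaling like this can degrade quality on short inputs, so it is typically enabled only when long-context processing is actually needed.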

When to Use This Model

  • General-Purpose Applications: Suitable for a wide range of tasks requiring strong language understanding and generation.
  • Long Context Processing: Ideal for use cases that involve extensive inputs, such as document analysis or summarization, due to its 131K token context window.
  • Coding & Mathematical Tasks: Recommended for applications requiring robust performance in code generation and complex mathematical problem-solving.
  • Multilingual Scenarios: Effective for applications needing support across various languages.
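As an instruction-tuned model, Qwen2-7B-Instruct expects prompts in ChatML-style turns (`<|im_start|>role ... <|im_end|>`). In practice you would call the tokenizer's `apply_chat_template`; the sketch below only illustrates the wire format, and the helper name is hypothetical:

```python
def to_chatml(messages, add_generation_prompt=True):
    """Illustrative ChatML formatter: one <|im_start|>role\\ncontent<|im_end|>
    block per message, plus an open assistant turn for generation."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>"
        for m in messages
    ]
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")  # model completes this turn
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me a short introduction to large language models."},
])
```

The open `<|im_start|>assistant` turn at the end is what cues the model to respond; generation stops when it emits `<|im_end|>`.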