PrimeIntellect/Qwen3-4B

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Sep 24, 2025License:apache-2.0Architecture:Transformer Open Weights Warm

PrimeIntellect/Qwen3-4B is a 4 billion parameter causal language model, cloned from Qwen/Qwen3-4B. This model features a 40960 token context length and is specifically configured with a multi-turn, tool-call compatible chat template. It is designed for conversational AI applications requiring structured interaction and tool integration.

Loading preview...

PrimeIntellect/Qwen3-4B Overview

PrimeIntellect/Qwen3-4B is a 4 billion parameter language model, derived from the Qwen3-4B architecture. This version is distinguished by its specialized chat template, which supports multi-turn conversations and is compatible with tool-calling functionalities. With a substantial context window of 40960 tokens, it is engineered to handle complex and extended conversational flows.

Key Capabilities

  • Multi-turn Chat: Optimized for engaging in continuous, multi-turn dialogues.
  • Tool-Call Compatibility: Designed to integrate seamlessly with external tools and functions, enabling more dynamic and capable AI agents.
  • Extended Context: A 40960-token context length allows for processing and generating longer, more detailed responses while maintaining conversational coherence.

Good For

  • Developing conversational agents that require structured interaction.
  • Applications needing to integrate external tools or APIs through a language model.
  • Use cases demanding a large context window for complex dialogue management.