tuandunghcmut/gemma-3-1b-it-qwen3-tool-template

Parameters: 1B
Precision: BF16
Context length: 32768 tokens
Jan 25, 2026
License: gemma

Model Overview

This model is a variant of Google DeepMind's Gemma 3 1B-IT, a 1-billion-parameter instruction-tuned language model. Its key differentiator is a Qwen3-style tool-calling chat template, which lets the model emit structured tool calls and read tool responses, making it well suited to applications that execute external functions. The Gemma 3 family accepts image input at larger sizes, but Google's 1B checkpoint is text-only, so image input should not be assumed for this model.

Key Capabilities

  • Tool-Calling: Emits Qwen3-style <tool_call> and processes <tool_response> tags, supporting single or multiple parallel tool calls within an assistant turn.
  • Multimodal Input: In the Gemma 3 family, image input (normalized to 896x896 resolution) is available at the larger sizes; the 1B base this model derives from is text-only.
  • Instruction-Tuned: Optimized for following instructions and generating coherent, relevant text.
  • Context Window: 32K tokens (32,768), the limit for the 1B size, which leaves room for tool definitions and multi-step tool exchanges alongside the conversation.
  • Multilingual Support: The base Gemma 3 models support over 140 languages.
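
To make the tool-calling capability concrete, here is a minimal sketch of extracting tool calls from generated text. It assumes each call is a JSON object with "name" and "arguments" keys wrapped in <tool_call> tags, as in the Qwen3 chat template; the helper name parse_tool_calls and the sample tools are hypothetical and not part of this model's repository.

```python
import json
import re

# Assumption: each call is a JSON object wrapped in <tool_call>...</tool_call>,
# following the Qwen3 chat template convention.
TOOL_CALL_RE = re.compile(r"<tool_call>\s*(.*?)\s*</tool_call>", re.DOTALL)


def parse_tool_calls(text):
    """Return a list of {"name": ..., "arguments": ...} dicts found in text."""
    calls = []
    for raw in TOOL_CALL_RE.findall(text):
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError:
            # Skip malformed payloads instead of failing the whole turn.
            continue
        if isinstance(payload, dict) and "name" in payload:
            calls.append(
                {"name": payload["name"], "arguments": payload.get("arguments", {})}
            )
    return calls


# Hypothetical assistant output containing two parallel tool calls.
sample = (
    '<tool_call>\n{"name": "get_weather", "arguments": {"city": "Hanoi"}}\n</tool_call>\n'
    '<tool_call>\n{"name": "get_time", "arguments": {"tz": "Asia/Ho_Chi_Minh"}}\n</tool_call>'
)
calls = parse_tool_calls(sample)
```

Skipping malformed payloads is a design choice for this sketch; a stricter agent loop might instead return the parse error to the model so it can retry.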

Good for

  • Tool-Augmented AI: Ideal for building agents that can interact with external APIs and tools.
  • Conversational AI: Enhances chatbots and virtual assistants with the ability to perform actions via tool calls.
  • Text Generation: Suitable for various text generation tasks, including creative writing, summarization, and question answering.
  • Image Understanding: Applies only where image input is available. The 1B Gemma 3 base is text-only, so analyzing images or feeding visual information into tool calls calls for a larger Gemma 3 size.
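
For the agent use cases listed above, the application side has to execute each requested tool and hand the result back to the model. The following is a rough sketch of that step, assuming tool calls have already been parsed into dicts with "name" and "arguments" keys and that results are returned as JSON inside <tool_response> tags, as in the Qwen3 convention. The TOOLS registry, get_weather, and its return values are hypothetical placeholders.

```python
import json


def get_weather(city):
    """Hypothetical tool; a real one would call an external API."""
    return {"city": city, "temp_c": 31}


# Hypothetical registry mapping tool names to Python callables.
TOOLS = {"get_weather": get_weather}


def run_tool_call(call):
    """Execute one parsed call and wrap its result in <tool_response> tags."""
    func = TOOLS.get(call["name"])
    if func is None:
        result = {"error": f"unknown tool: {call['name']}"}
    else:
        try:
            result = func(**call.get("arguments", {}))
        except Exception as exc:
            # Report the failure to the model rather than crash the loop.
            result = {"error": str(exc)}
    return f"<tool_response>\n{json.dumps(result)}\n</tool_response>"


def build_tool_turn(calls):
    """Join the responses for all parallel calls into one message body."""
    return "\n".join(run_tool_call(c) for c in calls)


turn = build_tool_turn(
    [
        {"name": "get_weather", "arguments": {"city": "Hanoi"}},
        {"name": "missing_tool", "arguments": {}},
    ]
)
```

Which message role carries this text back to the model depends on the chat template and is not specified here, so check the template shipped with the model before wiring this into a conversation loop.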