tbmod/gemma-3-4b-it

  • Capabilities: Vision
  • Concurrency Cost: 1
  • Model Size: 4.3B
  • Quant: BF16
  • Context Length: 32K
  • Published: Dec 11, 2025
  • License: Gemma
  • Architecture: Transformer

Gemma 3, developed by Google DeepMind, is a family of lightweight, multimodal open models built from the same research as Gemini. This 4.3 billion parameter instruction-tuned variant accepts both text and image inputs, generates text outputs, and features a 32K token context window. It performs well on text generation and image understanding tasks such as question answering, summarization, and reasoning, making it suitable for resource-limited environments.

Overview

Gemma 3 is a family of lightweight, multimodal open models from Google DeepMind, leveraging the same research and technology as the Gemini models. This instruction-tuned variant, tbmod/gemma-3-4b-it, is a 4.3 billion parameter model capable of processing both text and image inputs to generate text outputs. It supports a 32K token context window and multilingual capabilities across over 140 languages.

Key Capabilities

  • Multimodal Input: Processes text strings and images (normalized to 896x896 resolution, encoded to 256 tokens each).
  • Text Generation: Generates creative text formats, powers chatbots, and performs text summarization.
  • Image Understanding: Extracts, interprets, and summarizes visual data, enabling text-based answers about image content.
  • Multilingual Support: Trained on data including content in over 140 languages.
  • Reasoning & Factual Accuracy: Evaluated across various benchmarks for reasoning, STEM, code, and multilingual tasks.
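Because each image costs a fixed 256 tokens after normalization, it is easy to estimate how much of the 32K context window remains for text. A minimal sketch (the helper function is illustrative, not part of any Gemma API; 32K is taken as 32,768 tokens):

```python
# Rough context-budget estimate for a Gemma 3 style model:
# each image is normalized to 896x896 and encoded to a fixed 256 tokens.
IMAGE_TOKENS = 256
CONTEXT_WINDOW = 32_768  # 32K-token context window

def remaining_budget(num_images: int, prompt_tokens: int) -> int:
    """Tokens left for generated output after images and prompt text."""
    used = num_images * IMAGE_TOKENS + prompt_tokens
    if used > CONTEXT_WINDOW:
        raise ValueError(f"request exceeds context window by {used - CONTEXT_WINDOW} tokens")
    return CONTEXT_WINDOW - used

# Example: 4 images plus a 1,000-token prompt leave 30,744 tokens.
print(remaining_budget(4, 1000))  # -> 30744
```

In practice, prompt token counts depend on the tokenizer, so treat this as a back-of-the-envelope check rather than an exact accounting.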

Training & Hardware

The 4B model was trained on 4 trillion tokens, including web documents, code, mathematics, and images. Training utilized Google's Tensor Processing Unit (TPU) hardware (TPUv4p, TPUv5p, and TPUv5e) for performance, memory, and scalability, with software built on JAX and ML Pathways.

Good For

  • Content Creation: Generating diverse text formats and marketing copy.
  • Conversational AI: Developing chatbots and virtual assistants.
  • Research & Education: Serving as a foundation for VLM/NLP research and language learning tools.
  • Resource-Limited Environments: Its relatively small size makes it suitable for deployment on laptops, desktops, or private cloud infrastructure.
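For local deployments of the kind described above, the model is often served behind an OpenAI-compatible chat endpoint. The sketch below only assembles a multimodal request payload in that format; the endpoint shape, and sending it to an actual local server, are assumptions rather than anything this model card specifies:

```python
import base64
import json

def build_vision_request(prompt: str, image_bytes: bytes,
                         model: str = "tbmod/gemma-3-4b-it") -> dict:
    """Assemble an OpenAI-style chat completion payload with one inline image."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                # Image is passed inline as a base64 data URL.
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
    }

payload = build_vision_request("Describe this image.", b"\x89PNG...")
print(json.dumps(payload)[:60])
```

To actually run it, POST this payload to your local server's `/v1/chat/completions` route (host and port depend on your setup).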