jordimas/gemma-3-1b-it

1B parameters · BF16 · 32,768-token context · License: gemma

Gemma 3: Multimodal, Lightweight, and Open

Gemma 3 is a family of open, multimodal models developed by Google DeepMind, built on the same research and technology as the Gemini models. This instruction-tuned variant, jordimas/gemma-3-1b-it, is a 1-billion-parameter, text-only model designed for efficient deployment.

Key Capabilities

  • Multimodal Input (family-level): The Gemma 3 family accepts text strings and images (normalized to 896x896 resolution and encoded to 256 tokens each); the 1B size, including this variant, accepts text only.
  • Text Generation: Generates text outputs for a wide range of tasks (a minimal loading sketch follows this list).
  • Context Window: A 32K-token input context window for the 1B size, with an 8,192-token output context.
  • Multilingual Support: Supports over 140 languages, with training data including diverse web documents, code, and mathematical text.
  • Resource-Efficient: Its relatively small size makes it suitable for deployment on devices with limited resources, such as laptops or desktops.
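
As a hedged illustration of basic usage, the sketch below loads the checkpoint with Hugging Face transformers and generates a reply through the chat template. The model ID is taken from this repository; the transformers version requirement (Gemma 3 support landed around 4.50) and the example prompt are assumptions, not part of this card.

```python
# Minimal text-generation sketch for jordimas/gemma-3-1b-it.
# Assumes a transformers release with Gemma 3 support (~4.50+) and bfloat16-capable hardware;
# drop torch_dtype/device_map to run in float32 on CPU.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jordimas/gemma-3-1b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# The instruction-tuned variant expects the chat template, not raw prompts.
messages = [{"role": "user", "content": "Write a two-sentence summary of what Gemma 3 is."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Strip the prompt tokens and decode only the newly generated reply.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The same code runs on CPU without a GPU; the 1B size is small enough for laptop-class hardware, consistent with the resource-efficiency note above.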

Training and Performance

The 1B model was trained on 2 trillion tokens of web documents, code, mathematical text, and images, with rigorous CSAM and sensitive-data filtering applied to the corpus. Published benchmarks cover reasoning, STEM, code, and multilingual tasks; for example, the 1B model scores 62.3 on HellaSwag (10-shot) and 73.8 on PIQA (0-shot).
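
To make the benchmark figures concrete, here is a hedged sketch of how such few-shot scores could be reproduced with EleutherAI's lm-evaluation-harness (the lm_eval package). The harness, task names, and batch size are assumptions of this example; exact numbers depend on harness version and prompt formatting and may not match the figures quoted above.

```python
# Hedged evaluation sketch using EleutherAI's lm-evaluation-harness (pip install lm-eval).
# Scores vary with harness version and prompting, so treat results as indicative only.
import lm_eval

# HellaSwag is quoted 10-shot in the card; PIQA is quoted 0-shot and would be
# evaluated in a separate call with num_fewshot=0.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=jordimas/gemma-3-1b-it,dtype=bfloat16",
    tasks=["hellaswag"],
    num_fewshot=10,
    batch_size=8,
)
print(results["results"]["hellaswag"])
```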

Good for

  • Content Creation: Generating creative text formats, marketing copy, or email drafts.
  • Conversational AI: Powering chatbots and virtual assistants.
  • Text Summarization: Creating concise summaries of documents (see the pipeline sketch after this list).
  • Image Understanding (4B and larger sizes): Extracting and interpreting visual data for text communications; not applicable to this text-only 1B variant.
  • Research & Education: Serving as a foundation for VLM and NLP research, language learning tools, and knowledge exploration.
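
As a hedged example of the summarization use case, the sketch below uses the transformers text-generation pipeline with chat-formatted messages. The prompt wording and generation settings are illustrative assumptions, not recommendations from this card.

```python
# Illustrative summarization sketch using the transformers pipeline API.
# The prompt and max_new_tokens are arbitrary choices for demonstration.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="jordimas/gemma-3-1b-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

document = "..."  # replace with the text to summarize
messages = [
    {"role": "user",
     "content": f"Summarize the following document in three sentences:\n\n{document}"}
]

result = generator(messages, max_new_tokens=200)
# With chat-style input, 'generated_text' holds the full conversation;
# the last message is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```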