Gemma 3 4.3B Pre-trained Model Overview
Google DeepMind's Gemma 3 is a family of lightweight, multimodal open models built on the same research and technology as the Gemini models. This 4.3 billion parameter pre-trained variant accepts text and image input and generates text output, with a 32,768-token context window. It supports more than 140 languages.
Key Capabilities
- Multimodal Input: Processes both text strings (questions, prompts, documents) and images (normalized to 896x896 resolution, encoded to 256 tokens each).
- Text Generation: Generates diverse text outputs, including answers, image content analysis, and document summaries.
- Extensive Context: Supports a total input context of 32,768 tokens for this size, enabling processing of longer and more complex inputs.
- Multilingual Support: Trained on web documents in over 140 languages, enhancing its global applicability.
- Reasoning & STEM: Demonstrates strong performance across various reasoning, STEM, and code benchmarks, including MMLU, MATH, and HumanEval.
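The image and context figures listed above imply a simple token budget: every image costs 256 tokens out of the 32,768-token input context. The following is a minimal sketch of that arithmetic only. The function names `text_token_budget` and `max_images` are hypothetical helpers, not part of any Gemma library, and the sketch assumes no extra tokens are spent on prompt templates or special markers, which a real tokenizer would add.

```python
# Figures taken from the capability list above.
CONTEXT_WINDOW = 32_768   # total input context, in tokens
TOKENS_PER_IMAGE = 256    # each image is encoded to 256 tokens


def text_token_budget(num_images: int,
                      context_window: int = CONTEXT_WINDOW,
                      tokens_per_image: int = TOKENS_PER_IMAGE) -> int:
    """Tokens left for text after reserving space for the images.

    Hypothetical helper: ignores special tokens and prompt templates.
    """
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    image_tokens = num_images * tokens_per_image
    if image_tokens > context_window:
        raise ValueError("images alone exceed the context window")
    return context_window - image_tokens


def max_images(text_tokens: int,
               context_window: int = CONTEXT_WINDOW,
               tokens_per_image: int = TOKENS_PER_IMAGE) -> int:
    """Largest number of images that fit alongside the given text length."""
    if not 0 <= text_tokens <= context_window:
        raise ValueError("text_tokens must be within the context window")
    return (context_window - text_tokens) // tokens_per_image


if __name__ == "__main__":
    print(text_token_budget(4))   # 32768 - 4 * 256 = 31744
    print(max_images(30_000))     # (32768 - 30000) // 256 = 10
```

In practice the usable budget will be somewhat smaller than these numbers, so treat them as an upper bound when planning prompts that mix long documents with several images.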
Intended Use Cases
- Content Creation: Ideal for generating creative text formats, marketing copy, and email drafts.
- Conversational AI: Suitable for powering chatbots, virtual assistants, and interactive applications.
- Information Extraction: Can extract, interpret, and summarize visual data for text communications.
- Research & Education: Serves as a foundation for VLM and NLP research, language learning tools, and knowledge exploration.