Overview
This model is an abliterated version of Gemma 3, a family of lightweight, state-of-the-art open models developed by Google DeepMind. Abliteration removes the model's refusal direction from its activations, so it complies with prompts the original would typically decline. Built using the same research and technology as the Gemini models, Gemma 3 models are multimodal, processing both text and image inputs to generate text outputs. The 1 billion parameter variant features a 32K-token context window and supports multilingual interactions in over 140 languages.
Key Capabilities
- Multimodal Input: Accepts text strings and images; each image is normalized to 896x896 resolution and encoded to 256 tokens.
- Text Generation: Excels at tasks like question answering, summarization, and creative content generation.
- Image Understanding: Capable of analyzing image content and extracting visual data.
- Multilingual Support: Trained on data spanning more than 140 languages.
- Resource-Efficient: Its relatively small size allows deployment in environments with limited resources, such as laptops or desktops.
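The image and context figures above imply a simple token budget: each image consumes a fixed 256 tokens of the 32K-token window, leaving the remainder for text. A small illustrative sketch (the helper name and structure are ours, not part of the model's API):

```python
# Back-of-the-envelope context budgeting using the figures stated above:
# a 32K-token context window, and 256 tokens per 896x896 image.
CONTEXT_WINDOW = 32_768   # 32K-token context window
TOKENS_PER_IMAGE = 256    # each image is encoded to 256 tokens

def remaining_text_budget(num_images: int, context: int = CONTEXT_WINDOW) -> int:
    """Tokens left for text after reserving room for `num_images` images."""
    used = num_images * TOKENS_PER_IMAGE
    if used > context:
        raise ValueError("images alone exceed the context window")
    return context - used
```

For example, a prompt carrying four images still leaves 31,744 tokens for text, which is why multi-image prompts remain practical within the 32K window.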
Training Details
Gemma 3 models were trained on a diverse dataset including web documents, code, mathematics, and images. The 1B variant was trained on 2 trillion tokens. Rigorous data preprocessing included CSAM filtering and sensitive-data filtering to improve safety and reliability. Training was conducted on Tensor Processing Unit (TPU) hardware using JAX and ML Pathways, aligning with Google's sustainability commitments.
Intended Usage
This model is well-suited for content creation (e.g., poems, scripts, marketing copy), chatbots, text summarization, and image data extraction. It also serves as a valuable tool for NLP and VLM research, language learning, and knowledge exploration.
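For the chatbot use case, prompts follow Gemma's turn-based chat format. A minimal sketch of that formatting is below; it assumes the standard Gemma turn markers (`<start_of_turn>` / `<end_of_turn>`), and in practice you would let the tokenizer's `apply_chat_template` handle this rather than building strings by hand:

```python
def build_gemma_prompt(messages):
    """Render (role, text) pairs into a Gemma-style chat prompt.

    `messages` is a list of (role, text) tuples, e.g. [("user", "Hi")].
    The final open "model" turn cues the model to begin generating.
    """
    parts = []
    for role, text in messages:
        parts.append(f"<start_of_turn>{role}\n{text}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")  # generation starts here
    return "".join(parts)
```

For example, `build_gemma_prompt([("user", "Summarize this article.")])` yields a single user turn followed by the open model turn, ready to pass to the tokenizer.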