Name: Lightricks/gemma-3-12b-it-qat-q4_0-unquantized API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Lightricks

Gemma 3 12B Instruction-Tuned QAT Model

This model is a 12 billion parameter instruction-tuned variant from Google DeepMind's Gemma 3 family, utilizing Quantization Aware Training (QAT). While the provided checkpoint is unquantized, it's designed for subsequent Q4_0 quantization to achieve significant memory reduction with minimal quality loss compared to bfloat16.

Key Capabilities

Multimodal: Processes both text and image inputs (images normalized to 896x896 resolution, encoded to 256 tokens each) and generates text outputs.
Extensive Context: Features a large 128K token input context window, enabling processing of lengthy inputs.
Multilingual Support: Supports over 140 languages for diverse applications.
Optimized for Deployment: QAT enables efficient deployment in environments with limited resources like laptops, desktops, or private cloud infrastructure.
Broad Task Performance: Excels in text generation and image understanding tasks, including question answering, summarization, and reasoning.

Good For

Content Creation: Generating creative text formats, marketing copy, and email drafts.
Conversational AI: Powering chatbots and virtual assistants.
Research & Education: Serving as a foundation for VLM/NLP research, language learning tools, and knowledge exploration.
Image Data Extraction: Interpreting and summarizing visual data for text communications.

Overview

Gemma 3 12B Instruction-Tuned QAT Model

Key Capabilities

Good For

Full Model Card (README)