unsloth/gemma-3-27b-it-qat

Status: Warm
Visibility: Public
Modality: Vision
Parameters: 27B
Quantization: FP8
Context length: 32768
Released: Apr 21, 2025
License: gemma
Source: Hugging Face
Overview

unsloth/gemma-3-27b-it-qat is a 27 billion parameter instruction-tuned model from Google DeepMind's Gemma 3 family. It is a multimodal model, capable of processing both text and image inputs (normalized to 896x896 resolution) and generating text outputs. A key differentiator for this specific model is its use of Quantization Aware Training (QAT), which allows it to preserve quality comparable to bfloat16 models while substantially reducing memory footprint, making it deployable on devices with limited resources.
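The memory savings from QAT can be sketched with back-of-the-envelope arithmetic. The figures below are estimates only: they count weights alone (no activations, KV cache, or runtime overhead), take the 27B parameter count at face value, and assume 4-bit weights, which is a common target for QAT checkpoints rather than a detail stated on this card.

```python
PARAMS = 27e9  # nominal parameter count for the 27B model


def weight_footprint_gb(bits_per_param: float) -> float:
    """Approximate weight memory in gigabytes (1 GB = 1e9 bytes).

    Weights only -- real deployments also need memory for activations,
    the KV cache, and framework overhead.
    """
    return PARAMS * bits_per_param / 8 / 1e9


bf16 = weight_footprint_gb(16)  # full-precision baseline
int4 = weight_footprint_gb(4)   # 4-bit QAT weights (assumption)

print(f"bf16 weights: ~{bf16:.0f} GB")   # ~54 GB
print(f"int4 weights: ~{int4:.1f} GB")   # ~13.5 GB
```

Under these assumptions the quantized weights take roughly a quarter of the bfloat16 footprint, which is what makes single-GPU or workstation deployment plausible for a model this size.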

Key Capabilities

  • Multimodal Understanding: Handles text and image inputs, generating text outputs for tasks like image analysis and visual data extraction.
  • Extensive Context Window: Supports a 128K-token context window, enabling processing of lengthy inputs.
  • Multilingual Support: Trained on data spanning over 140 languages, supporting diverse linguistic applications.
  • Optimized for Deployment: QAT enables efficient deployment by reducing memory requirements without significant quality loss.
  • Broad Task Performance: Excels in text generation, summarization, question answering, and reasoning tasks.

Training and Performance

The 27B model was trained on 14 trillion tokens of web documents, code, mathematics, and images. It demonstrates strong performance across a range of benchmarks:

  • Reasoning & Factuality: Achieves 85.6 on HellaSwag (10-shot) and 77.7 on BIG-Bench Hard (few-shot).
  • STEM & Code: Scores 78.6 on MMLU (5-shot) and 48.8 on HumanEval (0-shot).
  • Multilingual: Reaches 74.3 on MGSM and 75.7 on Global-MMLU-Lite.
  • Multimodal: Attains 85.6 on DocVQA and 72.9 on VQAv2.

Good For

  • Resource-constrained environments: Ideal for deployment on laptops, desktops, or private cloud infrastructure due to QAT optimization.
  • Multimodal applications: Suitable for tasks requiring both text and image understanding.
  • Global applications: Its extensive multilingual support makes it valuable for diverse language use cases.
  • Research and development: Serves as a robust foundation for experimenting with VLM and NLP techniques.
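For multimodal use cases like those above, inputs are typically supplied as a chat-style message list combining image and text parts. The sketch below shows that payload shape as commonly used with Gemma 3 chat templates in Transformers; the image URL and prompt are placeholders, not values from this card, and the actual model call is left as a comment since it downloads the full 27B weights.

```python
# Sketch of a multimodal chat payload in the format typically passed to
# Transformers chat templates for Gemma 3. URL and prompt are placeholders.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/chart.png"},
            {"type": "text", "text": "Summarize the trend shown in this chart."},
        ],
    }
]

print(len(messages[0]["content"]))  # 2: one image part, one text part

# With the real model you would then run something like (not executed here,
# as it fetches the 27B checkpoint):
#   from transformers import pipeline
#   pipe = pipeline("image-text-to-text", model="unsloth/gemma-3-27b-it-qat")
#   out = pipe(text=messages, max_new_tokens=128)
```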