vmo247/vmo-gemma-4-26b-a4b-ft
vmo247/vmo-gemma-4-26b-a4b-ft is a 26-billion-parameter multimodal Mixture-of-Experts (MoE) model from the Gemma 4 family by Google DeepMind, with 3.8 billion active parameters for efficient inference. It targets reasoning, coding, and multimodal understanding, and processes text, images, and video within a 256K-token context window. It is designed for deployment on consumer GPUs and workstations, with enhanced agentic capabilities and native function-calling support.
Gemma 4 26B A4B MoE Overview
This model is part of the Gemma 4 family developed by Google DeepMind. It uses a Mixture-of-Experts (MoE) architecture with 26 billion total parameters, of which 3.8 billion are active. Because only the active parameters take part in computation, inference runs almost as fast as a 4B-parameter model while drawing on the capacity of a much larger one. The model supports a 256K-token context window and is multimodal: it accepts text, image, and video inputs and generates text outputs. A hybrid attention mechanism keeps long-context processing efficient.
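The relationship between total and active parameters can be made concrete with a little arithmetic. The sketch below uses only the two figures stated above (26 billion total, 3.8 billion active). The "2 FLOPs per active parameter per token" figure is a common rule of thumb, not a number from this model card, and the function names are illustrative.

```python
# Parameter counts as stated in the overview.
TOTAL_PARAMS = 26e9
ACTIVE_PARAMS = 3.8e9


def active_fraction(total: float, active: float) -> float:
    """Share of the model's parameters that are active."""
    return active / total


def approx_flops_per_token(active_params: float) -> float:
    """Rough forward-pass cost per token.

    Assumption: about 2 FLOPs (one multiply, one add) per active
    parameter. This ignores attention cost, which grows with context.
    """
    return 2 * active_params


frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
print(f"Active share: {frac:.1%}")
print(f"Approx. FLOPs per token: {approx_flops_per_token(ACTIVE_PARAMS):.2e}")
```

Under these assumptions roughly one parameter in seven is active, which is why per-token compute is closer to that of a 4B-parameter dense model than a 26B one.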
Key Capabilities
- Reasoning: Designed with configurable thinking modes for enhanced problem-solving.
- Multimodality: Processes text, images (with variable aspect ratio and resolution), and video, allowing for interleaved inputs.
- Coding & Agentic Capabilities: Achieves strong performance in coding benchmarks and includes native function-calling support for autonomous agents.
- Long Context: Features a 256K token context window, suitable for complex, long-context tasks.
- Native System Prompt Support: Enables more structured and controllable conversations.
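Function calling generally means the model emits a structured request that the host application parses and executes. The sketch below shows only the host side of that loop. The JSON shape (`name` plus `arguments`), the `get_weather` tool, and the `dispatch_tool_call` helper are all hypothetical; the card does not specify this model's actual tool-call format, so consult the model's chat template before relying on any particular shape.

```python
import json


def get_weather(city: str) -> str:
    """Hypothetical tool that returns canned data for illustration."""
    return f"Sunny in {city}"


# Registry mapping tool names to Python callables (illustrative only).
TOOLS = {"get_weather": get_weather}


def dispatch_tool_call(raw: str) -> str:
    """Parse an assumed JSON tool call and run the matching tool."""
    call = json.loads(raw)
    name = call["name"]
    args = call.get("arguments", {})
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**args)


# Stand-in for text the model might emit; the format is an assumption.
model_output = '{"name": "get_weather", "arguments": {"city": "Hanoi"}}'
result = dispatch_tool_call(model_output)
print(result)
```

In an agent loop, the tool result would be appended to the conversation and sent back to the model so it can produce its final answer.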
Good For
- Reasoning-intensive tasks: Leveraging its built-in thinking mode.
- Multimodal applications: Integrating text, image, and video understanding.
- Code generation and agentic workflows: Due to its enhanced coding and function-calling support.
- Deployment on consumer GPUs: Only 3.8 billion of its 26 billion parameters are active, which keeps inference cost low for its size.
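When sizing a GPU for a model like this, a quick estimate of weight memory helps. The sketch below assumes that all 26 billion parameters are held in memory, since in a typical MoE deployment every expert must be available even though only some are active. The precision table and function name are illustrative, and the estimate covers weights only: the KV cache, activations, and quantization overhead are excluded.

```python
# Approximate bytes per parameter for common weight formats (assumed).
BYTES_PER_PARAM = {"bf16": 2.0, "int8": 1.0, "int4": 0.5}

TOTAL_PARAMS = 26e9  # total parameter count stated in the card


def weight_memory_gb(total_params: float, precision: str) -> float:
    """Weights-only memory estimate in gigabytes (1 GB = 1e9 bytes)."""
    return total_params * BYTES_PER_PARAM[precision] / 1e9


for precision in BYTES_PER_PARAM:
    gb = weight_memory_gb(TOTAL_PARAMS, precision)
    print(f"{precision}: ~{gb:.0f} GB for weights")
```

By this rough estimate, 16-bit weights need about 52 GB and 4-bit weights about 13 GB, so fitting on a single consumer GPU most likely depends on quantization or offloading.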