pankajmathur/Mimma-3-4b-v3
VISIONConcurrency Cost:1Model Size:4.3BQuant:BF16Ctx Length:32kLicense:gemmaArchitecture:Transformer0.0K Cold

Mimma-3-4b-v3 is a multimodal vision-language model developed by pankajmathur, based on the Gemma 3 architecture. This model integrates text and image understanding, generating text output from both modalities. It features a large 128K context window and multilingual support across over 140 languages, making it suitable for diverse text generation and image analysis tasks like question answering, summarization, and reasoning.

Loading preview...