v1v1d1/nayana-gemma3-4b-stage1
Vision · Concurrency Cost: 1 · Model Size: 4.3B · Quant: BF16 · Ctx Length: 32k · Published: Feb 3, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights · Cold

v1v1d1/nayana-gemma3-4b-stage1 is a 4.3-billion-parameter fine-tuned Vision-Language Model (VLM) based on Google's Gemma-3-4b-it, developed by v1v1d1. It was fine-tuned with LoRA adapters using the MS-Swift framework on the Nayana Docmatix Stage 1 dataset, which comprises 150k samples in English, Kannada, and Hindi. The model is designed for multimodal tasks, excelling at image description and understanding across these three languages.
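Because the model is a Gemma-3 derivative, it can typically be used through the standard Hugging Face `transformers` multimodal chat interface. The sketch below is illustrative, not taken from the model card: the `build_vlm_messages` helper shows the Gemma-3-style chat payload (one image plus a text prompt), and the commented-out loading and generation lines assume the usual `transformers` image-text-to-text API and the ~4.3B BF16 weights being available locally.

```python
def build_vlm_messages(image_path: str, question: str) -> list:
    """Build a Gemma-3-style multimodal chat turn: one image plus a text prompt."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image_path},
                {"type": "text", "text": question},
            ],
        }
    ]

# Loading and generation (illustrative sketch; downloads the ~4.3B BF16 weights):
#
# from transformers import AutoProcessor, AutoModelForImageTextToText
#
# model_id = "v1v1d1/nayana-gemma3-4b-stage1"
# processor = AutoProcessor.from_pretrained(model_id)
# model = AutoModelForImageTextToText.from_pretrained(model_id, torch_dtype="bfloat16")
#
# messages = build_vlm_messages("document.png", "Describe this image in Kannada.")
# inputs = processor.apply_chat_template(
#     messages, add_generation_prompt=True, tokenize=True,
#     return_dict=True, return_tensors="pt",
# )
# output = model.generate(**inputs, max_new_tokens=128)
# print(processor.decode(output[0], skip_special_tokens=True))
```

Since the training data spans English, Kannada, and Hindi, the text prompt can be written in any of the three languages.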
