gghfez/amoral-gemma3-12B-vision
VISIONConcurrency Cost:1Model Size:12BQuant:FP8Ctx Length:32kPublished:Mar 21, 2025License:gemmaArchitecture:Transformer0.0K Cold
The gghfez/amoral-gemma3-12B-vision model is a 12 billion parameter vision-capable language model, reattaching the vision encoder to the soob3123/amoral-gemma3-12B base. This model is designed for detailed image description and multimodal understanding, offering enhanced visual analysis compared to its text-only counterparts. It processes both image and text inputs, making it suitable for applications requiring comprehensive visual content analysis. With a context length of 32768 tokens, it can handle extensive multimodal prompts.
Loading preview...