thelamapi/next-12b

VISIONConcurrency Cost:1Model Size:12BQuant:FP8Ctx Length:32kPublished:Oct 27, 2025License:mitArchitecture:Transformer0.0K Open Weights Cold

Thelamapi/next-12b is a 12-billion parameter multimodal Vision-Language Model (VLM) based on Gemma 3, developed by Lamapi. It is fine-tuned for exceptional performance in both text and image understanding, offering advanced reasoning and context-aware multimodal outputs. This model provides professional-grade Turkish support alongside extensive multilingual capabilities, making it suitable for enterprises requiring complex visual understanding and creative generation.

Loading preview...

Next 12B: Türkiye's Advanced Multimodal VLM

Next 12B is a 12-billion parameter multimodal Vision-Language Model (VLM) built on Gemma 3, developed by Lamapi. It is specifically fine-tuned to deliver high performance in both text and image understanding, positioning itself as Türkiye's most advanced open-source vision-language model. The model excels in superior understanding and generation of text and image descriptions, advanced reasoning, and context-aware multimodal outputs.

Key Capabilities

  • Multimodal Vision-Language: Deep understanding of images with sophisticated visual reasoning capabilities.
  • Multilingual Support: Offers professional-grade Turkish language support while maintaining extensive multilingual reach.
  • Superior Reasoning: Demonstrates strong logical and analytical reasoning for complex tasks, achieving 92.7% on MMLU and 95.3% on GSM8K benchmarks.
  • Optimized Architecture: Balanced performance and efficiency, supporting various quantization formats for flexible deployment.

Ideal Use Cases

  • Advanced Visual Analysis: Detailed image understanding and description.
  • Enterprise Content Generation: High-quality multilingual content creation.
  • Complex Reasoning: Multimodal QA, educational applications, and research assistance.
  • Production-Ready: Designed for enterprise deployment with reliable and consistent outputs.