Lamapi/next-12b

Vision · 12B parameters · FP8 · 32,768 context length · Updated Oct 27, 2025 · License: MIT
Overview

Next 12B: Türkiye's Advanced Vision-Language Model

Next 12B is a 12-billion-parameter multimodal Vision-Language Model (VLM) built on Gemma 3 and developed by Lamapi. Positioned as Türkiye's most advanced open-source VLM, it excels at both text and image understanding, and is fine-tuned for text and image-description generation, advanced reasoning, and context-aware multimodal outputs.

Key Capabilities

  • Multimodal Understanding: Deep comprehension of images combined with sophisticated visual reasoning.
  • Multilingual Support: Offers industry-leading Turkish language performance while maintaining extensive multilingual capabilities.
  • Superior Reasoning: Demonstrates strong logical and analytical reasoning for complex tasks, achieving 92.7% on MMLU, 95.3% on GSM8K, and 87.2% on MATH benchmarks.
  • Optimized Architecture: Balanced for performance and efficiency, supporting various quantization formats (Q8_0, Q4_K_M, F16, F32) for flexible deployment.

Good For

  • Enterprise Applications: Reliable and consistent outputs for production deployments, including high-quality multilingual content generation and customer support automation.
  • Advanced Visual Analysis: Detailed image understanding, multimodal QA, and visual document processing.
  • Complex Reasoning: Ideal for educational systems, research assistance, and creative storytelling requiring advanced analytical capabilities.
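
Example Usage

As a usage sketch for the capabilities above, the snippet below builds the chat-style multimodal prompt format commonly accepted by Gemma-family models in `transformers`. The message schema, the example image URL, and the question are assumptions for illustration; check them against the model's actual chat template before relying on them.

```python
def build_messages(image_url: str, question: str) -> list[dict]:
    """Build a chat-style multimodal prompt: one user turn pairing an
    image reference with a text question (schema assumed from the common
    Gemma-family chat format, not verified against this model)."""
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "url": image_url},
                {"type": "text", "text": question},
            ],
        }
    ]


# Hypothetical inputs; any image URL and question work here.
messages = build_messages(
    "https://example.com/receipt.jpg",
    "Summarize this document in Turkish.",
)
```

With `transformers` installed, such a payload can then be passed to an image-text-to-text pipeline, e.g. `pipeline("image-text-to-text", model="Lamapi/next-12b")(text=messages)`, assuming the repository exposes standard `transformers`-compatible weights.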