thelamapi/next2-air

VISIONConcurrency Cost:1Model Size:2.3BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Mar 7, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

Next2-Air is a 2.3 billion parameter Vision-Language Model (VLM) developed by Lamapi in Türkiye, built on the Qwen 3.5-2B architecture. It is optimized for lightweight, fast inference on local machines and edge devices, featuring multimodal understanding, logical reasoning with Chain-of-Thought, and native bilingual support for Turkish and English. The model supports a substantial 262,144 token context length, making it suitable for extensive document processing and real-time applications.

Loading preview...

Overview

Next2-Air is a 2.3 billion parameter Vision-Language Model (VLM) developed by Lamapi, based on the Qwen 3.5-2B architecture. It is designed for lightweight, fast, and capable performance on local machines and edge devices, emphasizing reasoning and multimodal understanding. The model is instruction-tuned using specialized datasets to enhance logical deduction and image processing, offering native support for both Turkish and English.

Key Capabilities

  • Optimized for Edge: Runs efficiently on MacBooks, mid-range PCs, and edge hardware without requiring powerful GPUs.
  • Multimodal Understanding: Processes images, performs OCR, and understands visual context.
  • Advanced Reasoning: Utilizes Chain-of-Thought (<think>) for logical deduction.
  • Extensive Context: Supports a native context length of 262,144 tokens, ideal for long document summarization.
  • Bilingual Proficiency: Fine-tuned for natural, fluent, and accurate responses in both Turkish and English.

Benchmark Performance

Next2-Air demonstrates competitive performance in the ultra-lightweight category, often surpassing its base model and competing with larger 3B-4B models. It shows improvements in text, reasoning, and instruction following benchmarks like MMLU-Pro (68.2%), MMLU-Redux (82.1%), and IFEval (82.5%). For multimodal tasks, it achieves strong results in MMMU (66.5%), MathVision (78.1%), and OCRBench (86.0%).

Ideal Use Cases

  • Mobile & Edge AI: Deploying smart assistants on smartphones or Raspberry Pi.
  • Real-Time OCR & Parsing: Quickly extracting data from receipts, invoices, or UI screenshots.
  • Fast Conversational Bots: Providing low-latency responses in Turkish and English.
  • Gaming & NPC Logic: Serving as a fast reasoning engine for dynamic in-game characters.