Name: janhq/Jan-v2-VL-low API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: janhq

Jan-v2-VL: Multimodal Agent for Long-Horizon Tasks

Jan-v2-VL is an 8-billion parameter vision–language model developed by janhq, specifically engineered for executing long-horizon, multi-step tasks within real software environments such as browsers and desktop applications. It integrates language reasoning with visual perception, enabling it to follow complex instructions, manage intermediate states, and recover from minor execution errors.

Key Capabilities

Long-Horizon Execution: Built for stable, many-step execution, crucial for real-world tasks where small per-step gains compound into longer successful chains. Evaluated using the "Illusion of Diminishing Returns" benchmark, which measures execution length and aligns with robust long-horizon ability.
Multimodal Perception: Combines language understanding with visual input to interact effectively with graphical user interfaces.
Performance: Shows no degradation on standard text-only and vision tasks compared to its base model (Qwen-3-VL-8B-Thinking), and even performs slightly better on several, while delivering stronger long-horizon execution.

Good For

Agentic Automation & UI Control: Ideal for stepwise operations in browsers and desktop applications, leveraging screenshot grounding and tool calls (e.g., BrowserMCP).
Complex Task Automation: Suitable for tasks where a plan or knowledge can be provided upfront, and success depends on stable, multi-step execution with minimal drift.

This specific variant, Jan-v2-VL-low, is optimized for efficiency and lower latency, making it a practical choice for applications requiring responsive agentic behavior.

Overview

Jan-v2-VL: Multimodal Agent for Long-Horizon Tasks

Key Capabilities

Good For

Full Model Card (README)