Name: sensenova/SenseNova-MARS-8B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: sensenova

SenseNova-MARS-8B: Multimodal Agentic Reasoning and Search

SenseNova-MARS-8B is an 8 billion parameter Vision-Language Model (VLM) designed to enhance agentic reasoning and tool-use capabilities through reinforcement learning. Unlike traditional VLMs that primarily rely on text-oriented chain-of-thought, SenseNova-MARS dynamically integrates image search, text search, and image cropping tools to address knowledge-intensive and visually complex scenarios.

Key Capabilities

Interleaved Visual Reasoning and Tool-Use: Seamlessly combines visual understanding with dynamic tool manipulation.
Integrated Tool Suite: Leverages text search, image search, and image crop tools for fine-grained visual analysis.
Reinforcement Learning Optimization: Utilizes the Batch-Normalized Group Sequence Policy Optimization (BN-GSPO) algorithm for stable training and effective tool invocation.
High-Resolution Image Handling: Evaluated on the HR-MMSearch benchmark, specifically designed for high-resolution images and knowledge-intensive, search-driven questions.

Performance Highlights

SenseNova-MARS-8B achieves competitive performance on search-oriented benchmarks. In agentic model evaluations, it scores 67.84 on MMSearch and 41.64 on HR-MMSearch, demonstrating robust capabilities in complex visual tasks requiring external tools. The larger SenseNova-MARS-32B variant even surpasses proprietary models like Gemini-3-Pro and GPT-5.2 on these benchmarks.

Good for

Applications requiring advanced multimodal reasoning with dynamic tool integration.
Tasks involving knowledge-intensive visual understanding and search.
Scenarios demanding coordinated use of image search, text search, and image cropping.

Overview

SenseNova-MARS-8B: Multimodal Agentic Reasoning and Search

Key Capabilities

Performance Highlights

Good for

Full Model Card (README)