Name: SCAI-JHU/MindZero-gw-tom-Qwen3-VL-8B-Instruct API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: SCAI-JHU

MindZero-gw-tom-Qwen3-VL-8B-Instruct Overview

This model is an 8-billion-parameter vision-language model developed by SCAI-JHU, built on the Qwen3-VL-8B-Instruct architecture. It is a specialized MindZero checkpoint trained for online Theory-of-Mind (ToM) reasoning in gridworld environments. Its core innovation is a self-supervised reinforcement learning approach that lets the model learn mental-state reasoning without any explicit mental-state annotations during training.

Key Capabilities

Online Theory-of-Mind Reasoning: Learns to infer the mental states (e.g., beliefs, intentions) of agents in real time from their observed actions.
Self-Supervised Learning: Uses a training mechanism in which the model is rewarded for generating mental-state hypotheses that maximize the likelihood of the observed actions, as estimated by a planner.
Efficient Inference: After training, the model has internalized this reasoning process, allowing fast, single-pass inference of mental states.
Vision-Language Integration: As a VL model, it processes both visual and textual inputs, which is crucial for understanding gridworld scenarios.
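
The reward described under Self-Supervised Learning can be illustrated with a small sketch. This is not the MindZero training code: the Boltzmann-style planner, the Manhattan-distance scoring, the `beta` parameter, and all function names below are assumptions chosen only to show how a goal hypothesis might be scored by the likelihood it assigns to observed actions.

```python
import math

# Hypothetical 4-connected gridworld actions (names are illustrative).
ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def manhattan(a, b):
    """Manhattan distance between two (x, y) cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def action_log_probs(pos, goal, beta=2.0):
    """Toy stand-in for a planner: a softmax over actions in which moves
    that end closer to the hypothesized goal receive higher probability."""
    scores = {}
    for name, (dx, dy) in ACTIONS.items():
        nxt = (pos[0] + dx, pos[1] + dy)
        scores[name] = -beta * manhattan(nxt, goal)
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(scores.values())
    log_z = m + math.log(sum(math.exp(s - m) for s in scores.values()))
    return {name: s - log_z for name, s in scores.items()}


def hypothesis_reward(start, observed_actions, goal_hypothesis, beta=2.0):
    """Sum of log-likelihoods of the observed actions under one goal hypothesis."""
    pos, total = start, 0.0
    for action in observed_actions:
        total += action_log_probs(pos, goal_hypothesis, beta)[action]
        dx, dy = ACTIONS[action]
        pos = (pos[0] + dx, pos[1] + dy)
    return total


# An agent starts at (0, 0) and is seen moving right, right, down.
observed = ["right", "right", "down"]
reward_a = hypothesis_reward((0, 0), observed, goal_hypothesis=(3, 1))
reward_b = hypothesis_reward((0, 0), observed, goal_hypothesis=(0, 3))
print(f"goal (3, 1): {reward_a:.3f}")
print(f"goal (0, 3): {reward_b:.3f}")
```

In this toy setup the hypothesis "the goal is (3, 1)" explains the trajectory better than "the goal is (0, 3)", so it receives the higher reward. A real planner would also account for walls, beliefs, and partial observability, which this sketch ignores.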

Performance

On the Gridworld-QA benchmark, this 8B-parameter model achieves a score of 92.3 on mental reasoning tasks in these environments. For comparison, its 4B counterpart achieves 95.0.

Good For

Research and development in AI agents requiring advanced Theory-of-Mind capabilities.
Applications involving understanding and predicting agent behavior in structured, interactive environments like gridworlds.
Exploring self-supervised learning paradigms for complex cognitive tasks.
