Alibaba-DAMO-Academy/RynnBrain-Nav-8B

VISIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Jan 28, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Cold

RynnBrain-Nav-8B by Alibaba-DAMO-Academy is an 8 billion parameter embodied foundation model, built upon Qwen3-VL-8B-Instruct, specifically fine-tuned for visual language navigation and path planning tasks. It excels at integrating vision and language instructions to perform precise navigation in physical spaces. This model is designed to provide physics-aware planning outputs for downstream robotic systems, offering comprehensive egocentric understanding and spatiotemporal localization.

Loading preview...

RynnBrain-Nav-8B: Embodied Navigation Model

RynnBrain-Nav-8B, developed by Alibaba-DAMO-Academy, is an 8 billion parameter embodied foundation model derived from Qwen3-VL-8B-Instruct. It is specifically designed to function as a physics-aware embodied brain, capable of observing egocentric scenes and grounding language instructions to physical space and time. This model supports downstream robotic systems by providing reliable localization and planning outputs.

Key Capabilities

  • Visual Language Navigation: Integrates visual and language instructions to perform navigation and path planning in complex environments.
  • Comprehensive Egocentric Understanding: Offers strong spatial comprehension and egocentric cognition, including embodied QA, counting, OCR, and fine-grained video understanding.
  • Diverse Spatiotemporal Localization: Locates objects, target areas, and predicts trajectories across long episodic contexts, enabling global spatial awareness.
  • Physics-Aware Planning: Provides precise planning outputs by integrating localized affordances, areas, and objects, offering detailed instructions for visual language action (VLA) models.

Good For

  • Robotics and Embodied AI: Ideal for applications requiring robots to understand and navigate physical environments based on visual and linguistic cues.
  • Path Planning: Excels in scenarios demanding precise path planning and execution guided by visual language instructions.
  • Spatial Reasoning: Suitable for tasks that benefit from strong spatial comprehension and the ability to ground language in physical reality.