Name: Accio-Lab/Metis-8B-RL API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Accio-Lab

Metis-8B-RL: A Strategic Multimodal Reasoning Agent

Metis-8B-RL, developed by Accio-Lab, is an 8 billion parameter multimodal model built upon Qwen3-VL-8B-Instruct. It is the final RL-trained checkpoint of the Metis framework, utilizing Hierarchical Decoupled Policy Optimization (HDPO) to cultivate meta-cognitive tool use.

Key Capabilities & Differentiators

Efficient Tool Use: Drastically reduces blind tool invocation (from 98% to 2%) by learning when to use external tools like code execution, text search, and image search, rather than just how.
State-of-the-Art Performance: Achieves leading accuracy across 13 benchmarks among open-source 8B agentic models, demonstrating strong capabilities in perception, document understanding, and complex mathematical/logical reasoning.
HDPO Training: Employs a novel HDPO method with dual rewards and decoupled advantage estimation, allowing the model to first prioritize correctness and then optimize for tool efficiency.

Ideal Use Cases

Complex Multimodal Reasoning: Suited for tasks requiring strategic integration of visual and textual information with external tools.
Agentic Applications: Excellent for building intelligent agents that need to make informed decisions about when to invoke specific functionalities.
Problem Solving: Particularly strong in mathematical and logical reasoning, making it valuable for applications requiring precise problem-solving.

Overview

Metis-8B-RL: A Strategic Multimodal Reasoning Agent

Key Capabilities & Differentiators

Ideal Use Cases

Full Model Card (README)