Name: nics-efc/MARSHAL-Kuhn-Poker-Qwen3-4B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: nics-efc

MARSHAL: Kuhn Poker Specialist

This model, nics-efc/MARSHAL-Kuhn-Poker-Qwen3-4B, is a 4-billion-parameter fine-tune of Qwen3-4B, trained as a Kuhn Poker specialist within the MARSHAL framework. MARSHAL is an end-to-end reinforcement learning framework designed to enhance multi-agent reasoning through self-play in competitive and cooperative games. It addresses the credit assignment challenges that arise in multi-agent, multi-turn scenarios.
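For readers unfamiliar with the game, the sketch below encodes the payoff rules of standard three-card Kuhn Poker (cards J, Q, K; each player antes 1 chip; a bet costs 1 chip). It is an illustration of the game itself, not code from the MARSHAL project. The function name `payoff` and the history notation (`p` for pass/check/fold, `b` for bet/call) are assumptions chosen for this example.

```python
# Standard three-card Kuhn Poker payoffs (illustrative sketch, not MARSHAL code).
# Each player antes 1 chip; a bet or call adds 1 more chip.
RANK = {"J": 0, "Q": 1, "K": 2}

# Terminal action histories, alternating player 1, player 2, player 1.
# 'p' = pass (check, or fold when facing a bet); 'b' = bet (or call when facing a bet).
TERMINAL = {"pp", "bp", "bb", "pbp", "pbb"}


def payoff(card_p1: str, card_p2: str, history: str) -> int:
    """Return player 1's net chip gain for a terminal history."""
    if history not in TERMINAL:
        raise ValueError(f"not a terminal history: {history!r}")
    if card_p1 == card_p2:
        raise ValueError("players cannot hold the same card")

    # +1 if player 1 holds the higher card, otherwise -1.
    showdown_sign = 1 if RANK[card_p1] > RANK[card_p2] else -1

    if history == "pp":    # both check: showdown for the antes
        return showdown_sign * 1
    if history == "bp":    # player 1 bets, player 2 folds
        return 1
    if history == "pbp":   # player 1 checks, player 2 bets, player 1 folds
        return -1
    # "bb" or "pbb": a bet was called, showdown for ante plus bet
    return showdown_sign * 2
```

For example, `payoff("J", "K", "bp")` is a successful bluff: player 1 holds the lowest card but wins the ante because player 2 folds. Hidden cards and bluffs of this kind are what make the game imperfect-information.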

Key Capabilities

Specialized Game Play: Expert performance in Kuhn Poker, a competitive imperfect-information game.
Advanced Credit Assignment: Utilizes a Turn-level Advantage Estimator for precise attribution of long-term outcomes to individual actions.
Stable Training: Employs Agent-specific Advantage Normalization to stabilize the training process by calibrating advantage estimates.
Generalization to Reasoning: Demonstrates notable generalization, yielding performance improvements on reasoning benchmarks when integrated into leading multi-agent systems (MASs).
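The listing does not give the formula behind Agent-specific Advantage Normalization, so the following is only a minimal sketch of one plausible reading: advantage estimates are standardized separately for each agent, so that one player's reward scale does not dominate the other's. The function name `normalize_per_agent`, the `(agent_id, advantage)` input layout, and the `eps` parameter are all assumptions made for illustration and may differ from what MARSHAL actually does.

```python
from collections import defaultdict
from statistics import fmean, pstdev


def normalize_per_agent(samples, eps=1e-8):
    """Standardize advantages within each agent's own group.

    samples: list of (agent_id, advantage) pairs (hypothetical layout).
    Returns a list of (agent_id, normalized_advantage) in the same order.
    """
    # Collect the raw advantages belonging to each agent.
    groups = defaultdict(list)
    for agent_id, adv in samples:
        groups[agent_id].append(adv)

    # Per-agent mean and population standard deviation.
    stats = {a: (fmean(v), pstdev(v)) for a, v in groups.items()}

    # Subtract the agent's mean and divide by its std (eps avoids division by zero).
    return [
        (a, (adv - stats[a][0]) / (stats[a][1] + eps))
        for a, adv in samples
    ]


example = [("p1", 1.0), ("p1", 3.0), ("p2", -10.0), ("p2", -20.0)]
normalized = normalize_per_agent(example)
```

In this example the two agents have very different raw scales, yet after normalization each agent's advantages are centered at zero with roughly unit spread.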

Good For

Research in Multi-Agent Reinforcement Learning: Particularly for understanding and developing strategic LLMs in game theory contexts.
Strategic Game AI Development: Ideal for applications requiring agents capable of complex decision-making in competitive, imperfect-information environments.
Enhancing Multi-Agent Systems: Can be integrated into MASs to boost performance on reasoning tasks, showing gains of up to +10.0% on AIME and +7.6% on GPQA-Diamond.
