zzwkk/MUA-RL-14B
Text Generation | Concurrency Cost: 1 | Model Size: 14B | Quant: FP8 | Ctx Length: 32k | Published: Aug 25, 2025 | License: apache-2.0 | Architecture: Transformer | 0.0K | Open Weights | Cold

zzwkk/MUA-RL-14B is a 14-billion-parameter language model with a 32K context length, developed by zzwkk and trained with multi-turn user-interacting agent reinforcement learning (MUA-RL). It is designed for agentic tool use in multi-turn conversations: LLM-simulated users are integrated into the reinforcement learning loop, so the model learns autonomously to communicate with users and to use tools effectively. It is intended for complex tasks that require sustained context and repeated tool interaction.
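
To make the idea of a simulated user inside the training loop concrete, here is a minimal toy sketch of one multi-turn rollout. It is not the model's actual training code: `simulated_user`, `toy_agent`, `add_tool`, the `CALL` message format, and the binary end-of-episode reward are all hypothetical stand-ins chosen for illustration. In a real setup the user and the agent would both be language models.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    role: str      # "user", "agent", or "tool"
    content: str


def simulated_user(history):
    """Hypothetical stand-in for an LLM-simulated user: asks once, then judges the answer."""
    agent_turns = [t for t in history if t.role == "agent"]
    if not agent_turns:
        return "What is 2 + 3?"
    return "DONE" if "5" in agent_turns[-1].content else "That is not right."


def add_tool(a, b):
    """Hypothetical tool that the agent can call."""
    return a + b


def toy_agent(history):
    """Hypothetical policy: call the tool first, then report the tool result."""
    if history[-1].role == "tool":
        return Turn("agent", f"The answer is {history[-1].content}.")
    return Turn("agent", "CALL add_tool 2 3")


def rollout(max_turns=8):
    """Run one episode; return the dialogue and an assumed binary outcome reward."""
    history = [Turn("user", simulated_user([]))]
    for _ in range(max_turns):
        reply = toy_agent(history)
        history.append(reply)
        if reply.content.startswith("CALL add_tool"):
            # Execute the tool call and feed the result back to the agent.
            _, _, a, b = reply.content.split()
            history.append(Turn("tool", str(add_tool(int(a), int(b)))))
            continue
        # Otherwise the agent spoke to the user, so the simulated user responds.
        user_msg = simulated_user(history)
        history.append(Turn("user", user_msg))
        if user_msg == "DONE":
            return history, 1.0  # task completed (assumed reward scheme)
    return history, 0.0


if __name__ == "__main__":
    dialogue, reward = rollout()
    for turn in dialogue:
        print(f"{turn.role}: {turn.content}")
    print("reward:", reward)
```

In a reinforcement learning setting, many such rollouts would be collected and the reward used to update the agent's policy; that update step is omitted here.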
