dp66/UMA-4B
Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Context Length: 32k | Published: Jan 14, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights

dp66/UMA-4B is a 4-billion-parameter causal language model fine-tuned with agentic reinforcement learning (RL). Built on the Qwen3-4B-Instruct-2507 base model, it supports a 32,768-token context length. The RL fine-tuning targets agentic tasks, improving performance in complex, multi-step interactions.
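Since the model is derived from Qwen3-4B-Instruct-2507, it presumably inherits that family's ChatML-style chat format. The sketch below illustrates what that prompt layout looks like; `build_chatml_prompt` is a hypothetical helper for exposition only, and in practice you would load the tokenizer from the hub and call its `apply_chat_template` method instead.

```python
# Illustrative sketch of the ChatML-style prompt format used by the Qwen
# instruct models (assumed to carry over to dp66/UMA-4B). Real code should
# use the tokenizer's apply_chat_template rather than this hand-rolled helper.

def build_chatml_prompt(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Leave the assistant turn open so generation continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful agent."},
    {"role": "user", "content": "List the files in the current directory."},
])
print(prompt)
```

The trailing open `<|im_start|>assistant` turn is what cues the model to produce its reply; multi-step agent loops append each tool result as a new turn and re-render the prompt, staying within the 32k-token window.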
