Name: Gen-Verse/DemyAgent-4B API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Gen-Verse

DemyAgent-4B: Agentic Reasoning with Reinforcement Learning

DemyAgent-4B, developed by Gen-Verse, is a 4-billion-parameter model designed for agentic reasoning tasks. It is trained with a novel GRPO-TCR recipe on 30,000 high-quality agentic RL data points, and it achieves performance competitive with significantly larger models (14B/32B parameters).

Key Capabilities & Differentiators

Exceptional Reasoning: Achieves state-of-the-art results on AIME2025 (70.0%) and strong performance on AIME2024 (72.6%) and GPQA-Diamond (58.5%), often outperforming models with 4-8x more parameters.
Efficient Agentic Performance: Demonstrates that effective Reinforcement Learning strategies, particularly with high-quality, real end-to-end trajectories, enable smaller models to excel in complex agentic tasks.
Optimized Tool Use: Employs deliberative reasoning with selective tool calls, making it more efficient than long chain-of-thought (long-CoT) models.
Data-Driven Approach: Highlights the critical role of data quality, training efficiency (exploration-friendly techniques), and reasoning strategy in agentic RL.
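
The "selective tool calls" idea above can be illustrated with a minimal sketch of an agent loop. This is not DemyAgent-4B's actual inference code or prompt format: the step dictionaries, the `calculator` tool, `run_agent`, and `stub_policy` are all hypothetical names, and the stub stands in for the model so the example runs offline. The point it shows is that a tool runs only when the policy asks for one, and the loop stops as soon as a final answer is produced.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical step format (an assumption for illustration only):
#   {"type": "tool", "tool": <name>, "input": <str>}  or
#   {"type": "final", "content": <str>}
Step = Dict[str, str]


def calculator(expression: str) -> str:
    """Toy tool: evaluate plain arithmetic (digits, + - * / and parentheses)."""
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed:
        raise ValueError("unsupported expression")
    # Builtins are disabled; only arithmetic characters reach eval.
    return str(eval(expression, {"__builtins__": {}}, {}))


TOOLS: Dict[str, Callable[[str], str]] = {"calculator": calculator}


def run_agent(
    policy: Callable[[List[str]], Step], question: str, max_steps: int = 4
) -> Tuple[str, List[str]]:
    """Ask the policy for steps; run a tool only when it requests one."""
    history = [f"question: {question}"]
    for _ in range(max_steps):
        step = policy(history)
        if step["type"] == "final":
            return step["content"], history
        result = TOOLS[step["tool"]](step["input"])
        history.append(f"tool {step['tool']}({step['input']}) -> {result}")
    return "no answer within step budget", history


def stub_policy(history: List[str]) -> Step:
    """Stand-in for the model: one calculator call, then a final answer."""
    if len(history) == 1:
        return {"type": "tool", "tool": "calculator", "input": "17 * 23"}
    return {"type": "final", "content": history[-1].split("-> ")[1]}


answer, trace = run_agent(stub_policy, "What is 17 * 23?")
print(answer)
```

In a real deployment, `stub_policy` would be replaced by a call to the hosted model, and the step budget (`max_steps`) would bound how many tool calls a single question may consume.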

Ideal Use Cases

Complex Problem Solving: Suited for applications requiring advanced mathematical, scientific, and code-related reasoning.
Resource-Constrained Environments: Well suited to agentic tasks where compute is limited, since it performs competitively at a much smaller scale.
Agent Development: Useful for developers building intelligent agents that require robust reasoning and strategic tool invocation.