Name: melon1891/agentbench-qwen3-4b-2stage-reasoning-20260228 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: melon1891

Overview

This model, melon1891/agentbench-qwen3-4b-2stage-reasoning-20260228, is a 4 billion parameter language model fine-tuned from melon1891/agentbench-qwen3-4b-lr5e6-20260224v2. It leverages LoRA (merged into the base model) to enhance its capabilities, with a maximum sequence length of 8192 tokens.

Key Capabilities

Multi-turn Agent Task Performance: Specifically trained to improve performance in complex, multi-turn agentic tasks.
Environment Interaction: Learns to process environment observations and select appropriate actions.
Tool Use: Developed to effectively utilize tools within agent workflows.
Error Recovery: Designed to recover from errors during task execution, contributing to more robust agent behavior.
Targeted Domains: Optimized for tasks in ALFWorld (household tasks) and DBBench (database operations).

Training Details

The model was trained for 3 epochs with a learning rate of 1e-06, using the melon1891/reasoning-chain-distilled-317 dataset. Loss was applied to all assistant turns in the multi-turn trajectory to reinforce learning across the entire task sequence.

Good For

Developing AI agents that require sequential reasoning and decision-making.
Applications involving complex, multi-step interactions with environments.
Tasks in household automation (ALFWorld-like scenarios) or database management (DBBench-like scenarios).

Overview

Overview

Key Capabilities

Training Details

Good For

Full Model Card (README)