Name: choco800/qwen3-4b-agent-v10 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: choco800

Overview

The choco800/qwen3-4b-agent-v10 is a 4 billion parameter language model, fully merged and fine-tuned from the Qwen/Qwen3-4B-Instruct-2507 base model using Unsloth. Unlike adapter repositories, this model provides merged weights, eliminating the need to load a separate base model.

Key Capabilities

Multi-turn Agent Task Performance: Specifically trained to improve performance in multi-turn agentic scenarios.
Environment Interaction: Learns to process environment observations and select appropriate actions.
Tool Use: Developed with a focus on effective tool integration and utilization.
Error Recovery: Designed to handle and recover from errors within complex task trajectories.
Targeted Domains: Optimized for tasks within ALFWorld (household tasks) and DBBench (database operations).

Training Details

The model was trained with a focus on applying loss to all assistant turns in multi-turn trajectories, ensuring comprehensive learning across observation, action, and error handling. Key training configurations include a maximum sequence length of 8192, 1 epoch, and a learning rate of 1e-05, utilizing LoRA with r=16 and alpha=32. Loss was computed exclusively on the assistant's responses, masking user prompts and observations.

Good For

Developing AI agents that require robust multi-turn interaction.
Applications involving complex task execution in simulated environments.
Scenarios demanding precise tool use and error handling capabilities.

Overview

Overview

Key Capabilities

Training Details

Good For

Full Model Card (README)