choco800/qwen3-4b-agent-v13

Text Generation · Open Weights

  • Model Size: 4B
  • Quantization: BF16
  • Context Length: 32K
  • Concurrency Cost: 1
  • Published: Mar 1, 2026
  • License: apache-2.0
  • Architecture: Transformer

choco800/qwen3-4b-agent-v13 is a 4-billion-parameter, Qwen3-based, instruction-tuned causal language model, fine-tuned from Qwen/Qwen3-4B-Instruct-2507. It is optimized for multi-turn agent tasks (environment observation, action selection, tool use, and error recovery) in complex scenarios such as ALFWorld household tasks. The model has a 32K context length and ships as a fully merged checkpoint, so no separate base model needs to be loaded.


choco800/qwen3-4b-agent-v13: Agent Trajectory Model

This model is a fully merged 4 billion parameter Qwen3-based instruction-tuned model, fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using Unsloth. Unlike standard adapter repositories, this release includes the merged weights, simplifying deployment as it does not require loading a separate base model.
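Because the weights are already merged, the model can be loaded with the standard transformers API alone; there is no PEFT adapter-attachment step. A minimal sketch, assuming a standard transformers install (the dtype follows the BF16 quantization listed above; everything else here is an illustrative default, not a value from the model card):

```python
MODEL_ID = "choco800/qwen3-4b-agent-v13"

def load_model(model_id: str = MODEL_ID):
    """Load the merged checkpoint directly; no base model or adapter needed."""
    # Import kept local so this module can be read without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="bfloat16",  # the card lists BF16 weights
        device_map="auto",
    )
    return tokenizer, model
```

A typical call site would then use `tokenizer.apply_chat_template(...)` on a multi-turn message list and pass the result to `model.generate`, exactly as with the base Qwen3 instruct model.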

Key Capabilities & Training Focus

The primary objective of this model's training was to significantly improve multi-turn agent task performance. It is specifically optimized for scenarios requiring:

  • Environment observation
  • Action selection
  • Effective tool use
  • Recovery from errors within multi-turn trajectories
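Concretely, a multi-turn trajectory of this kind is just an alternating chat history: the environment's observations arrive as user turns and the model's actions (including corrections after a failed action) are assistant turns. A hypothetical sketch, with invented observation and action strings purely for illustration:

```python
# Hypothetical ALFWorld-style trajectory: environment observations as
# "user" turns, model actions as "assistant" turns. The second assistant
# turn illustrates error recovery after an action that had no effect.
trajectory = [
    {"role": "system",    "content": "You are an agent in a household environment."},
    {"role": "user",      "content": "Obs: You are in the kitchen. You see a cabinet and a countertop."},
    {"role": "assistant", "content": "Action: open cabinet"},
    {"role": "user",      "content": "Obs: Nothing happens. (The cabinet is locked.)"},
    {"role": "assistant", "content": "Action: examine countertop"},  # recover from the failed action
    {"role": "user",      "content": "Obs: There is a key on the countertop."},
    {"role": "assistant", "content": "Action: take key from countertop"},
]

# The assistant turns are the actions the fine-tuning supervises.
assistant_turns = [m for m in trajectory if m["role"] == "assistant"]
print(len(assistant_turns))  # → 3
```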

Training focused on tasks within the ALFWorld (household tasks) environment, applying loss to all assistant turns to reinforce learning across the entire interaction sequence. The model was trained for 1 epoch with a maximum sequence length of 8192 tokens, utilizing LoRA (r=16, alpha=32) and a learning rate of 5e-06.
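For a sense of scale on the r=16 setting: a rank-r LoRA adapter on one (d_out × d_in) weight matrix adds two factors of shapes (r, d_in) and (d_out, r), so it trains r·(d_in + d_out) parameters instead of d_in·d_out, with outputs scaled by alpha/r (here 32/16 = 2). A quick arithmetic sketch; the 4096×4096 projection below is an illustrative round number, not an actual Qwen3-4B layer shape:

```python
def lora_trainable_params(d_in: int, d_out: int, r: int = 16) -> int:
    """Trainable parameters added by a rank-r LoRA adapter on one
    (d_out x d_in) weight matrix: A is (r, d_in), B is (d_out, r)."""
    return r * d_in + d_out * r

# Illustrative square projection, not the real Qwen3-4B hidden size.
full = 4096 * 4096
lora = lora_trainable_params(4096, 4096, r=16)
print(lora)                  # 131072 trainable parameters
print(f"{lora / full:.2%}")  # 0.78% of the full matrix
```

This is why merging the adapter back into the base weights, as this release does, costs nothing at inference time: the low-rank update is simply added into each affected matrix.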

Data & Licensing

The model was trained on a combination of datasets including u-10bei/dbbench_sft_dataset_react (v1-v4), which are available on Hugging Face Hub under the MIT License. Users must adhere to both the dataset licenses and the base model's original Apache 2.0 terms of use.