Model Overview
choco800/qwen3-4b-agent-v24 is a 4-billion-parameter language model, fully merged and fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using Unsloth. Unlike adapter-only repositories, this one contains the complete merged weights, so no separate base model needs to be loaded.
Key Capabilities
This model is specifically trained to enhance multi-turn agent task performance, particularly within environments like ALFWorld (household tasks). Its training objective focuses on enabling the model to:
- Learn environment observation: Understand and interpret the state of an interactive environment.
- Perform action selection: Choose appropriate actions based on observations and task goals.
- Utilize tools: Integrate and effectively use external tools within a task trajectory.
- Recover from errors: Adapt and correct its behavior in response to unexpected outcomes or failures.
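The observe/act/recover cycle described above can be sketched as a minimal agent loop. All names here (`env.reset`, `env.step`, `policy`) are illustrative stand-ins, not the model's actual API; `env.step` is assumed to return an observation, a done flag, and a success flag.

```python
def run_episode(env, policy, max_steps=10):
    """Minimal observe-act loop with naive error recovery.

    `env.step(action)` returns (observation, done, ok); `policy` maps the
    observation history to an action string. Hypothetical interface, for
    illustration only.
    """
    obs, done = env.reset(), False
    history = [obs]
    for _ in range(max_steps):
        action = policy(history)
        obs, done, ok = env.step(action)
        if not ok:
            # Error recovery: feed the failure message back so the policy
            # can pick a different action on the next turn.
            history.append(f"Action failed: {obs}")
            continue
        history.append(obs)
        if done:
            break
    return history, done
```

In a real deployment the `policy` callable would wrap a generation call to this model, with the environment observations serialized into the chat history.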
Loss was applied to all assistant turns in the multi-turn trajectory, so the model learns from every one of its responses in an interaction, not only the final answer.
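Concretely, supervising only the assistant turns usually means setting the label to the ignore index (-100 in PyTorch cross-entropy) for every non-assistant token. A minimal sketch, with a stand-in `tokenize` callable in place of the real tokenizer:

```python
IGNORE_INDEX = -100  # label value that PyTorch cross-entropy skips

def build_labels(turns, tokenize):
    """Concatenate a multi-turn trajectory and mask loss to assistant turns.

    `turns` is a list of (role, text) pairs; `tokenize` is any callable
    mapping text to a list of token ids (a stand-in for the real tokenizer).
    """
    input_ids, labels = [], []
    for role, text in turns:
        ids = tokenize(text)
        input_ids.extend(ids)
        # Supervise assistant tokens; ignore user/tool/system tokens.
        labels.extend(ids if role == "assistant" else [IGNORE_INDEX] * len(ids))
    return input_ids, labels
```

The exact token boundaries in this model's training pipeline depend on the chat template; the sketch only shows the masking principle.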
Training Details
The model was trained for 1 epoch with a maximum sequence length of 8192 tokens and a learning rate of 7e-06. Training used LoRA (r=8, alpha=16), later merged into the base weights, together with NEFTune noise (NEFTUNE_NOISE_ALPHA=5.0) to improve training stability and performance. The training data consisted primarily of ALFWorld trajectory datasets (v3, v4, v5) from u-10bei, with loss masking restricting supervision to the assistant's responses.
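Since this repository ships merged weights, the LoRA update has already been folded into each target matrix as W' = W + (alpha/r) · B · A. A pure-Python sketch of that arithmetic (toy nested-list matrices, not the actual checkpoint code):

```python
def merge_lora(W, A, B, r=8, alpha=16):
    """Fold a LoRA update into a base weight: W' = W + (alpha/r) * B @ A.

    W is (m x n), B is (m x r_eff), A is (r_eff x n), as nested lists.
    r=8, alpha=16 match this model's training config, giving a scale of 2.
    """
    scale = alpha / r
    rows, cols, rank = len(B), len(A[0]), len(A)
    return [
        [
            W[i][j] + scale * sum(B[i][k] * A[k][j] for k in range(rank))
            for j in range(cols)
        ]
        for i in range(rows)
    ]
```

Because the merge is already done, inference frameworks can load the repository like any ordinary dense checkpoint, with no adapter-loading step.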
Good For
- Developing AI agents for interactive, multi-step tasks.
- Applications requiring robust tool use and error recovery in simulated or real-world environments.
- Research into agentic LLMs and their performance in complex task execution.