choco800/qwen3-4b-agent-v14

Hugging Face
Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Ctx Length: 32k · Published: Mar 1, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

choco800/qwen3-4b-agent-v14 is a 4-billion-parameter Qwen3-Instruct model fine-tuned for multi-turn agent task performance. This fully merged model, developed by choco800, handles environment observation, action selection, tool use, and error recovery within agent trajectories. It is specifically optimized for tasks like those found in ALFWorld, making it suitable for complex, interactive AI agent applications.


Overview

This model, choco800/qwen3-4b-agent-v14, is a 4 billion parameter Qwen3-Instruct variant, specifically fine-tuned to enhance multi-turn agent task performance. Unlike typical adapter repositories, this is a fully merged model, meaning it contains all necessary weights and does not require loading a separate base model. It was trained using LoRA and Unsloth, with a maximum sequence length of 8192 tokens.
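Because the weights are fully merged, the checkpoint can be loaded directly with the Transformers `AutoModel` classes, with no PEFT adapter step. A minimal sketch, assuming `transformers` and `torch` are installed (the function name is illustrative):

```python
# Sketch: loading the fully merged checkpoint with Hugging Face Transformers.
# No separate base model or LoRA adapter is required, since the adapter
# weights are already merged into the checkpoint.

def load_agent_model(model_id: str = "choco800/qwen3-4b-agent-v14"):
    """Return (tokenizer, model) for the merged checkpoint."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    import torch

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # matches the BF16 quant listed above
        device_map="auto",
    )
    return tokenizer, model
```

The loading is wrapped in a function here only so the snippet can be imported without immediately downloading ~8 GB of weights.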

Key Capabilities

  • Multi-turn Agent Task Performance: Optimized for complex, sequential tasks requiring multiple interactions.
  • Environment Observation: Capable of processing and understanding environmental cues.
  • Action Selection & Tool Use: Designed to make appropriate decisions and utilize tools within an agentic workflow.
  • Error Recovery: Trained to handle and recover from errors encountered during task execution.
  • Efficient Deployment: Provided as a fully merged model, simplifying integration and usage.
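The observe/act/tool-use/recover cycle above can be sketched as a ReAct-style loop. This is a toy illustration: `fake_policy` stands in for the fine-tuned model, and `calculator` is a hypothetical tool, not part of the model card.

```python
# Minimal sketch of the multi-turn agent loop this model is tuned for:
# observe -> select action -> call a tool -> recover from tool errors.
# `fake_policy` is a scripted stand-in for the model; all names are illustrative.

def calculator(expr: str) -> str:
    """A toy tool; raises on bad input so error recovery can be exercised."""
    return str(eval(expr, {"__builtins__": {}}, {}))

def fake_policy(history: list) -> dict:
    """Stub for the model: returns a scripted ReAct-style decision."""
    last = history[-1]["content"]
    if "Error" in last:                      # error recovery: retry with fixed args
        return {"action": "calculator", "input": "2+2"}
    if last.startswith("Task"):              # first observation: a (deliberately bad) tool call
        return {"action": "calculator", "input": "2+"}
    return {"action": "finish", "input": last}

def run_agent(task: str, max_turns: int = 5) -> str:
    history = [{"role": "user", "content": f"Task: {task}"}]
    for _ in range(max_turns):
        decision = fake_policy(history)
        history.append({"role": "assistant", "content": str(decision)})
        if decision["action"] == "finish":
            return decision["input"]
        try:
            obs = calculator(decision["input"])   # tool use
        except Exception as e:
            obs = f"Error: {e}"                   # error fed back as an observation
        history.append({"role": "user", "content": obs})
    return "gave up"

print(run_agent("add two and two"))  # -> 4
```

In a real deployment the `fake_policy` stub would be replaced by a generate-and-parse call to the model, with each tool result appended to the chat history as a new turn.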

Training Focus

The model's training objective was to improve performance on agent tasks, particularly within environments like ALFWorld (household tasks). Loss was applied to all assistant turns in the multi-turn trajectory, ensuring comprehensive learning across observation, action, tool use, and error handling. The training utilized several versions of the dbbench_sft_dataset_react dataset, licensed under MIT.
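Applying loss to all assistant turns (rather than only the final one) can be sketched as a label-masking step: tokens from assistant messages are supervised, while user/observation tokens are masked with the ignore index used by PyTorch-style cross-entropy. The whitespace "tokenizer" below is a stand-in for the real Qwen3 tokenizer, and the trajectory is invented for illustration.

```python
# Toy sketch of per-turn loss masking: assistant tokens keep their labels,
# everything else is masked with -100 (the conventional ignore index).

IGNORE = -100

def build_labels(messages):
    tokens, labels = [], []
    for msg in messages:
        toks = msg["content"].split()            # toy whitespace tokenization
        tokens.extend(toks)
        if msg["role"] == "assistant":
            labels.extend(toks)                  # supervised: loss on these tokens
        else:
            labels.extend([IGNORE] * len(toks))  # masked: no loss contribution
    return tokens, labels

trajectory = [
    {"role": "user", "content": "obs: you are in the kitchen"},
    {"role": "assistant", "content": "action: open fridge"},
    {"role": "user", "content": "obs: the fridge is open"},
    {"role": "assistant", "content": "action: take apple"},
]
tokens, labels = build_labels(trajectory)
print(sum(l != IGNORE for l in labels))  # -> 6 supervised tokens, spanning both assistant turns
```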