choco800/qwen3-4b-agent-v1

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Feb 26, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

choco800/qwen3-4b-agent-v1 is a 4 billion parameter Qwen3-based instruction-tuned model, fine-tuned by choco800, specifically designed for multi-turn agent task performance. It excels at environment observation, action selection, tool use, and error recovery in tasks like household automation (ALFWorld) and database operations (DBBench). This model is optimized for complex, interactive agentic workflows, offering a specialized solution for developers building autonomous agents.

Loading preview...

choco800/qwen3-4b-agent-v1: Specialized for Multi-Turn Agent Tasks

This model is a fully merged, 4 billion parameter Qwen3-based instruction-tuned model, fine-tuned by choco800 using Unsloth. Unlike standard adapter repositories, it provides merged weights, eliminating the need to load a separate base model. Its core objective is to significantly enhance multi-turn agent task performance.

Key Capabilities

  • Agentic Trajectory Learning: Trained specifically on agent trajectories from ALFWorld (household tasks) and DBBench (database operations).
  • Comprehensive Agent Skills: Learns environment observation, action selection, tool use, and error recovery within multi-turn interactions.
  • Targeted Loss Application: Loss is applied to all assistant turns, ensuring robust learning across the entire agentic process.
  • Efficient Fine-tuning: Utilizes LoRA with Unsloth for efficient training, based on Qwen/Qwen3-4B-Instruct-2507.

Good For

  • Developing autonomous agents requiring multi-turn reasoning and interaction.
  • Applications involving tool use and error recovery in structured environments.
  • Tasks similar to household automation or database management where agents need to follow complex trajectories.

This model is ideal for developers focused on building intelligent agents that can navigate and complete multi-step tasks effectively.