Model Overview
choco800/qwen3-4b-agent-v8 is a 4-billion-parameter language model fine-tuned from the Qwen/Qwen3-4B-Instruct-2507 base model. It was trained using LoRA with Unsloth, and the adapters were merged back into the base weights to produce a 16-bit merged model. This repository therefore provides the full model weights, so no separate adapter loading is required.
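A minimal loading sketch with Hugging Face transformers is shown below. Because the repository ships fully merged 16-bit weights, no PEFT/LoRA adapter step is needed; the model ID is taken from this card, while the dtype and device settings are illustrative.

```python
# Hedged usage sketch: load the merged model directly with transformers.
# No PEFT/LoRA adapter loading is needed because the weights are merged.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "choco800/qwen3-4b-agent-v8"

def load(model_id=MODEL_ID):
    # Tokenizer and weights come from the same merged repository.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # keeps the merged 16-bit precision
        device_map="auto",    # place layers on available devices
    )
    return tokenizer, model
```

Generation then follows the standard Qwen3 chat-template workflow (`tokenizer.apply_chat_template` followed by `model.generate`).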
Key Capabilities
- Enhanced Agentic Performance: Specifically trained to improve multi-turn agent task performance.
- Task Domains: Optimized for tasks in ALFWorld (household tasks) and DBBench (database operations).
- Learning Trajectory: The model learns from all assistant turns in a multi-turn trajectory, covering environment observation, action selection, tool use, and error recovery.
- Context Length: Trained with a maximum sequence length of 8192 tokens.
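The trajectory structure described above can be sketched as a list of role-tagged turns, where all assistant turns (thoughts, actions, tool calls, recoveries) are training targets. The turn format and sample content here are hypothetical, modeled on a ReAct-style ALFWorld episode:

```python
# Illustrative multi-turn agent trajectory; only assistant turns are
# used as training targets. Content below is a made-up example.
def assistant_turns(trajectory):
    """Return the turns the model learns from (all assistant turns)."""
    return [t for t in trajectory if t["role"] == "assistant"]

trajectory = [
    {"role": "user", "content": "Task: put a clean mug on the desk."},
    {"role": "assistant", "content": "Thought: I need a mug. Action: go to countertop 1"},
    {"role": "user", "content": "Observation: On countertop 1, you see a mug 1."},
    {"role": "assistant", "content": "Action: take mug 1 from countertop 1"},
]
```

Every assistant turn in the episode, not just the final answer, contributes to the loss, which is what the "learning trajectory" bullet above refers to.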
Training Details
The model was trained for 1 epoch with a learning rate of 1e-05. Loss was applied only to the assistant's responses; user prompts and environment observations were masked out. Training used the u-10bei/dbbench_sft_dataset_react, u-10bei/dbbench_sft_dataset_react_v3, and u-10bei/dbbench_sft_dataset_react_v4 datasets, all distributed under the MIT License. Users must comply with both the dataset licenses and the base model's Apache 2.0 license.
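The assistant-only loss described above is conventionally implemented by setting masked positions in the label tensor to the ignore index (-100 for PyTorch cross-entropy). A minimal sketch, assuming tokenized turns are available as (role, token_ids) pairs:

```python
# Sketch of assistant-only loss masking. Tokens from user prompts and
# environment observations get IGNORE_INDEX labels, so they contribute
# no gradient; only assistant tokens are trained on.
IGNORE_INDEX = -100  # ignore index of PyTorch cross-entropy loss

def build_labels(turns):
    """turns: list of (role, token_ids) pairs for one trajectory.

    Returns flat (input_ids, labels) lists of equal length.
    """
    input_ids, labels = [], []
    for role, ids in turns:
        input_ids.extend(ids)
        if role == "assistant":
            labels.extend(ids)  # supervised: predict assistant tokens
        else:
            labels.extend([IGNORE_INDEX] * len(ids))  # masked out
    return input_ids, labels
```

Unsloth and TRL's SFT utilities provide equivalent masking out of the box; this sketch only shows the underlying idea.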