# thetmon/c19: Qwen3-4B Agentic LoRA Adapter
This repository provides a LoRA adapter for the Qwen3-4B-Instruct-2507 base model, specifically engineered to boost performance in complex, multi-turn agent tasks. Unlike general-purpose instruction models, this adapter is fine-tuned to excel in scenarios requiring sequential decision-making and interaction with environments.
## Key Capabilities
- Enhanced Multi-Turn Agent Performance: Optimized for tasks that involve a series of actions and observations, such as those found in ALFWorld and DBBench.
- Agentic Reasoning: Improves the model's ability to understand environment states, select appropriate actions, and utilize tools effectively.
- Error Recovery: Loss is applied to all assistant turns during training, so the model learns to recognize and recover from errors within a trajectory.
- Specialized for ALFWorld & DBBench: Directly trained on datasets for household task automation and database operations, making it highly effective for these domains.
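The action–observation loop these capabilities target can be sketched with a toy policy and environment. Both are hypothetical stand-ins for illustration only (the real policy is the fine-tuned model and the real environment is ALFWorld); the point is the turn structure: a failed action is fed back as an observation so the policy can recover.

```python
# Toy sketch of a multi-turn agent trajectory (illustrative, not from this repo).

def toy_policy(observation: str) -> str:
    """Stand-in for the fine-tuned model: maps the latest observation to an action."""
    if "closed" in observation:
        return "open drawer 1"  # error recovery: the previous action hit a closed drawer
    if "you see a mug" in observation:
        return "take mug 1 from drawer 1"
    return "go to drawer 1"

def toy_env(action: str) -> str:
    """Stand-in for an ALFWorld-style environment: returns an observation per action."""
    return {
        "go to drawer 1": "the drawer 1 is closed",
        "open drawer 1": "you see a mug 1 in drawer 1",
        "take mug 1 from drawer 1": "you pick up mug 1",
    }.get(action, "nothing happens")

# Accumulate (action, observation) pairs, stopping once the goal is reached.
trajectory = []
obs = "you are in the middle of a room"
for _ in range(4):
    action = toy_policy(obs)
    obs = toy_env(action)
    trajectory.append((action, obs))
    if "pick up" in obs:
        break
```

Each `(action, observation)` pair becomes one assistant turn plus one environment turn in the training trajectories, which is why per-turn loss (above) matters.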
## Training Details
- Base Model: Qwen/Qwen3-4B-Instruct-2507
- Methodology: LoRA (r=64, alpha=128) applied to a full-precision base model, trained for 3 epochs.
- Max Sequence Length: 4096 tokens.
- Training Data: u-10bei/sft_alfworld_trajectory_dataset_v5 and u-10bei/dbbench_sft_dataset_react_v4 (both MIT-licensed).
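As a rough sanity check on the adapter's footprint, the rank and alpha above imply the following per-layer arithmetic. The 2560-wide projection is an assumed, illustrative size, not taken from the model config.

```python
# LoRA factorizes the weight update as B @ A, with A: (r, d_in) and B: (d_out, r),
# so each adapted linear layer adds r*d_in + d_out*r trainable parameters.

def lora_params(d_in: int, d_out: int, r: int = 64) -> int:
    return r * d_in + d_out * r

r, alpha = 64, 128
scaling = alpha / r  # the low-rank update is scaled by alpha/r before being added
# Hypothetical 2560 x 2560 projection, for illustration only:
extra = lora_params(2560, 2560, r)  # parameters added by one adapted layer
```

With alpha = 2r, the update scaling is 2.0, a common choice that keeps the effective update magnitude stable as the rank changes.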
## When to Use This Model
This adapter is ideal for developers building AI agents that need to perform multi-step tasks, interact with environments, and demonstrate robust decision-making. It's particularly well-suited for applications in automated task execution, intelligent assistants requiring tool use, and scenarios demanding agentic capabilities over simple question-answering.
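A minimal sketch of how such an agent might be wired up, assuming `transformers` and `peft` are installed. The `load_agent` helper and the household-task messages are illustrative; only the base-model ID and `thetmon/c19` come from this card, and loading requires the weights and suitable hardware.

```python
def load_agent(adapter_id: str = "thetmon/c19"):
    """Load the base model and apply this LoRA adapter (requires weights + GPU)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base = AutoModelForCausalLM.from_pretrained(
        "Qwen/Qwen3-4B-Instruct-2507", torch_dtype="auto", device_map="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-4B-Instruct-2507")
    return model, tokenizer

# Multi-turn agent history in chat format: assistant turns carry actions,
# user-role turns carry environment observations (ALFWorld-style, illustrative).
messages = [
    {"role": "system", "content": "You are an agent in a household environment."},
    {"role": "user", "content": "Task: put a clean mug on the desk."},
    {"role": "assistant", "content": "Thought: I should find a mug first.\nAction: go to countertop 1"},
    {"role": "user", "content": "Observation: you see a mug 1 and a plate 2."},
]
```

At inference time the history would be rendered with the tokenizer's `apply_chat_template(messages, add_generation_prompt=True)` before generation, with each new observation appended as another user-role turn.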