Overview
This repository provides a LoRA adapter (r=64, alpha=128) for the Qwen3-4B-Instruct-2507 base model, developed by thetmon. It is specifically fine-tuned using LoRA and Unsloth to improve the base model's performance on multi-turn agent tasks. This adapter contains only the LoRA weights and requires the base model to be loaded separately.
Key Capabilities
- Enhanced Multi-Turn Agent Performance: Significantly improves the base model's ability to handle sequential, interactive tasks.
- Specialized for Agentic Workflows: Optimized for tasks requiring environment observation, action selection, tool use, and error recovery within multi-turn trajectories.
- Domain-Specific Improvement: Fine-tuned on a mix of ALFWorld (household tasks) and DBBench (database operations) datasets, making it suitable for similar agentic applications.
- Efficient Fine-tuning: Utilizes LoRA with a full precision base, trained for 3 epochs with a learning rate of 2e-04 and a max sequence length of 4096.
Good For
- Agent-based applications: Ideal for developers building AI agents that need to perform complex, multi-step tasks.
- Interactive environments: Suitable for scenarios where the model needs to interact with an environment, observe results, and adapt its actions.
- Tool use and planning: Particularly effective for tasks that involve selecting and using tools or planning sequences of actions.
- Improving Qwen3-4B-Instruct-2507's agentic capabilities: A direct enhancement for the specified base model in agent-related use cases.