thetmon/c22: Qwen3-4B Agent LoRA Adapter
This repository provides a LoRA adapter (r=64, alpha=128), developed by thetmon, for the Qwen/Qwen3-4B-Instruct-2507 base model. It is fine-tuned to improve the base model's performance on multi-turn agent tasks.
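To make the rank and alpha values concrete, here is a minimal sketch of what they imply. It assumes the standard LoRA scaling factor alpha / r (variants such as rsLoRA scale differently, and this card does not say which was used). The layer dimensions are hypothetical and chosen only for illustration; they are not the actual Qwen3-4B layer sizes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraShape:
    """Rank and alpha as stated in this card."""
    r: int = 64
    alpha: int = 128

    @property
    def scaling(self) -> float:
        # Standard LoRA multiplies the low-rank update B @ A by alpha / r.
        return self.alpha / self.r

    def extra_params(self, d_in: int, d_out: int) -> int:
        # A has shape (r, d_in) and B has shape (d_out, r),
        # so one adapted weight matrix adds r * (d_in + d_out) parameters.
        return self.r * (d_in + d_out)


shape = LoraShape()
print(shape.scaling)                               # 2.0
# Hypothetical 1024 x 1024 projection, for illustration only.
print(shape.extra_params(d_in=1024, d_out=1024))   # 131072
```

With these settings the low-rank update is weighted by a factor of 2 before being added to the frozen base weights.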
Key Capabilities & Training Focus
- Enhanced Agent Performance: The adapter is trained for multi-turn agent environments, with a focus on tasks that require sequential decision-making.
- ALFWorld & DBBench Specialization: It targets household task simulations (ALFWorld) and database operation tasks (DBBench), the two domains covered by the training data.
- Comprehensive Agent Learning: Training loss is applied across all assistant turns in a multi-turn trajectory, enabling the model to learn:
  - Environment observation and interpretation.
  - Effective action selection and tool use.
  - Robust error recovery mechanisms.
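Applying the loss to every assistant turn is commonly done by masking the labels of all non-assistant tokens. The sketch below illustrates that idea in plain Python. It is an assumption about how such masking can be implemented, not the training code used for this adapter. The `build_labels` helper and the toy token IDs are hypothetical.

```python
# -100 is the default ignore_index of PyTorch's cross-entropy loss,
# so positions labeled with it contribute nothing to the loss.
IGNORE_INDEX = -100


def build_labels(turns):
    """Concatenate a trajectory and mask every non-assistant token.

    turns: list of (role, token_ids) pairs in conversation order.
    Returns (input_ids, labels) as two lists of equal length.
    """
    input_ids, labels = [], []
    for role, token_ids in turns:
        input_ids.extend(token_ids)
        if role == "assistant":
            # Every assistant turn is supervised, not only the last one.
            labels.extend(token_ids)
        else:
            labels.extend([IGNORE_INDEX] * len(token_ids))
    return input_ids, labels


# Toy trajectory with made-up token IDs.
trajectory = [
    ("system", [1, 2]),
    ("user", [3, 4, 5]),        # environment observation
    ("assistant", [6, 7]),      # first action
    ("user", [8]),              # environment feedback
    ("assistant", [9, 10, 11]), # follow-up action
]
input_ids, labels = build_labels(trajectory)
print(labels)
```

Because both assistant turns keep their labels, the model receives a training signal for the first action as well as for the action taken after the environment's feedback.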
Technical Details
- Base Model: Qwen/Qwen3-4B-Instruct-2507
- Methodology: LoRA (Low-Rank Adaptation) with a full-precision base model, utilizing Unsloth for efficient fine-tuning.
- Training Data: A mix of u-10bei/sft_alfworld_trajectory_dataset_v5 and u-10bei/dbbench_sft_dataset_react_v4, both licensed under MIT.
- Configuration: Trained for 3 epochs with a learning rate of 2e-4 and a maximum sequence length of 4096 tokens.
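For reference, a minimal sketch of attaching the adapter to the base model with the `transformers` and `peft` libraries. The adapter ID `thetmon/c22` is an assumption taken from this card's title, and `load_agent_model` is a hypothetical helper name. Adjust both to match your setup.

```python
BASE_MODEL_ID = "Qwen/Qwen3-4B-Instruct-2507"
ADAPTER_ID = "thetmon/c22"  # assumption: repo ID inferred from the card title


def load_agent_model(base_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Load the base model, then attach the LoRA adapter on top of it."""
    # Imported inside the function so the heavy dependencies are only
    # required when the model is actually loaded.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id)
    # Wrap the frozen base model with the adapter weights.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `tokenizer, model = load_agent_model()` downloads both repositories on first use. Prompts would then typically be built with the tokenizer's chat template before generation.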
Good For
- Developers building AI agents that require robust multi-turn interaction.
- Applications involving automated task execution in simulated or real-world environments.
- Scenarios demanding improved tool use and error handling in conversational or agentic AI systems.