Model Overview
This model, curio184/qwen25-7b-agent-exp02-C_alfv3_dbv4, is a 7.6 billion parameter language model derived from Qwen/Qwen2.5-7B-Instruct. It has been fine-tuned using LoRA with Unsloth to enhance its performance in complex, multi-turn agent tasks.
Key Capabilities
- Multi-turn Agent Performance: Specifically trained to improve interaction and task completion over multiple conversational turns.
- Specialized Task Domains: Optimized for two distinct agentic domains:
- ALFWorld: Excels in household task automation and understanding.
- DBBench: Proficient in database operations and interactions.
- Full Merged Weights: The repository provides the complete merged weights, eliminating the need for separate adapter loading during deployment.
Training Details
The model was fine-tuned on a maximum sequence length of 2048 tokens for 2 epochs, utilizing a learning rate of 2e-06. Loss was applied to all assistant turns within the multi-turn trajectories to reinforce agentic behavior. The training data included alfworld_v3_fixed and dbbench_v4 datasets, licensed under MIT.
Good For
- Developing agents that require robust multi-turn interaction capabilities.
- Applications involving automated household tasks or database management.
- Researchers and developers looking for a specialized agent model based on the Qwen2.5 architecture.