Overview
plotMaker/qwen25-7b-sft-merged-v5v6-a50 is a 7.6-billion-parameter language model built on the Qwen2.5-7B-Instruct base. plotMaker fine-tuned two separate SFT variants (v5 and v6) with QLoRA and Unsloth, then merged them by linear weight interpolation (alpha=0.5). The result is a single fully merged model that requires no additional adapters or weights.
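The merge step described above can be sketched as a per-parameter linear interpolation of two checkpoints. This is a minimal illustration with plain Python lists standing in for tensors; the function name and the convention that alpha weights the v5 checkpoint are assumptions, not plotMaker's published recipe (with alpha=0.5 the two conventions coincide).

```python
def interpolate_state_dicts(sd_a, sd_b, alpha=0.5):
    """Linearly interpolate two checkpoints with matching parameter names.

    Computes alpha * sd_a + (1 - alpha) * sd_b for every parameter.
    In practice sd_a/sd_b would be torch state dicts; lists of floats
    are used here so the sketch stays self-contained.
    """
    if sd_a.keys() != sd_b.keys():
        raise ValueError("checkpoints must share the same parameter names")
    return {
        name: [alpha * a + (1 - alpha) * b
               for a, b in zip(sd_a[name], sd_b[name])]
        for name in sd_a
    }

# Toy "state dicts" standing in for the v5 and v6 SFT checkpoints.
v5 = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0, 0.0]}
v6 = {"layer.weight": [3.0, 4.0], "layer.bias": [2.0, 2.0]}

merged = interpolate_state_dicts(v5, v6, alpha=0.5)
# merged["layer.weight"] == [2.0, 3.0]
```

Because the result is an ordinary set of weights, no adapter needs to be attached at inference time.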
Key Capabilities
- Enhanced Multi-Turn Agent Performance: Specifically trained to improve performance on complex multi-turn agent tasks, such as those found in ALFWorld (household tasks) and DBBench (database operations).
- Robust Agentic Behavior: Optimized for learning environment observation, action selection, effective tool use, and recovery from errors within multi-turn trajectories.
- Efficient Fine-tuning: Uses QLoRA (4-bit quantization) and Unsloth for memory-efficient training, with a maximum sequence length of 2048 tokens and a learning rate of 5e-5.
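The training hyperparameters listed above can be collected into a single config fragment. The key names below are illustrative conventions common to QLoRA/Unsloth setups, not plotMaker's actual training script.

```python
# Hedged sketch: hyperparameters stated in this card, gathered into one dict.
# Key names are assumptions; only the values come from the card itself.
sft_config = {
    "base_model": "Qwen/Qwen2.5-7B-Instruct",
    "load_in_4bit": True,      # QLoRA: base weights quantized to 4 bits
    "max_seq_length": 2048,    # maximum sequence length in tokens
    "learning_rate": 5e-5,
}
```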
Good For
- Developing AI Agents: Ideal for applications requiring models to perform sequential, interactive tasks and manage state over multiple turns.
- Complex Task Automation: Suitable for scenarios where an agent needs to observe, act, and adapt to environmental feedback, such as in simulated environments or database interactions.
- Research in Agentic LLMs: Provides a strong base for further experimentation and development in the field of LLM-powered agents, particularly those focused on tool use and error handling.
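The observe/act/adapt loop these use cases rely on can be sketched generically. This is a minimal toy harness, not an ALFWorld or DBBench client: `ToyEnv` and `scripted_policy` are hypothetical stand-ins for a real environment and for the model's next-action prediction.

```python
def run_agent(env, policy, max_turns=5):
    """Minimal multi-turn observe-act loop (sketch).

    `env` exposes reset()/step(action); `policy` maps the running
    transcript to the next action -- in a real setup, an LLM call.
    """
    observation = env.reset()
    transcript = [("obs", observation)]
    for _ in range(max_turns):
        action = policy(transcript)
        transcript.append(("act", action))
        observation, done = env.step(action)
        transcript.append(("obs", observation))
        if done:
            break
    return transcript


class ToyEnv:
    """Trivial environment: the task succeeds once the agent opens the drawer."""

    def reset(self):
        return "You see a closed drawer."

    def step(self, action):
        if action == "open drawer":
            return "The drawer is open.", True
        return "Nothing happens.", False


def scripted_policy(transcript):
    # Stand-in for the model: always attempts the same action.
    return "open drawer"


trace = run_agent(ToyEnv(), scripted_policy)
# trace ends with ("obs", "The drawer is open.")
```

The transcript of alternating observations and actions is exactly the multi-turn trajectory format this model is trained to continue.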