thetmon/c17
Thetmon/c17 is a LoRA adapter for the 4-billion-parameter Qwen3-4B-Instruct-2507 base model, fine-tuned to improve multi-turn agent task performance. The adapter targets complex environments such as ALFWorld (household tasks) and DBBench (database operations), with a focus on environment observation, action selection, tool use, and error recovery. It is designed to be loaded on top of the Qwen3-4B-Instruct-2507 base model to add these specialized agentic capabilities.
Overview
This repository provides a LoRA adapter (r=64, alpha=128) for the Qwen3-4B-Instruct-2507 base model, developed by thetmon and fine-tuned with LoRA and Unsloth to improve the base model's performance on multi-turn agent tasks. The repository contains only the LoRA weights; the base model must be loaded separately.
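A minimal loading sketch using `transformers` and `peft`. The base-model Hub id (`Qwen/Qwen3-4B-Instruct-2507`) and the generation settings are assumptions for illustration; only the adapter id `thetmon/c17` comes from this card.

```python
BASE_MODEL = "Qwen/Qwen3-4B-Instruct-2507"  # assumed Hub id of the base model
ADAPTER = "thetmon/c17"                     # this repository (LoRA weights only)

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Load the base model, attach the LoRA adapter, and generate a reply."""
    # Heavy imports live inside the function so the constants above can be
    # inspected without transformers/peft installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, device_map="auto")
    # Attach the adapter; only the LoRA deltas are fetched from this repo.
    model = PeftModel.from_pretrained(base, ADAPTER)

    messages = [{"role": "user", "content": prompt}]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

For a merged, adapter-free checkpoint, `model.merge_and_unload()` from `peft` can fold the LoRA weights into the base model after loading.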
Key Capabilities
- Enhanced Multi-Turn Agent Performance: Significantly improves the base model's ability to handle sequential, interactive tasks.
- Specialized for Agentic Workflows: Optimized for tasks requiring environment observation, action selection, tool use, and error recovery within multi-turn trajectories.
- Domain-Specific Improvement: Fine-tuned on a mix of ALFWorld (household tasks) and DBBench (database operations) datasets, making it suitable for similar agentic applications.
- Efficient Fine-tuning: Utilizes LoRA with a full precision base, trained for 3 epochs with a learning rate of 2e-04 and a max sequence length of 4096.
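The reported hyperparameters, collected in one place for reference. Only values stated in this card are included; anything unstated (optimizer, batch size, warmup) is deliberately omitted rather than guessed.

```python
# Training configuration as reported in this model card.
TRAINING_CONFIG = {
    "method": "LoRA (via Unsloth), full-precision base",
    "lora_r": 64,
    "lora_alpha": 128,
    "epochs": 3,
    "learning_rate": 2e-4,
    "max_seq_length": 4096,
    "datasets": ["ALFWorld (household tasks)", "DBBench (database operations)"],
}

# Under the common convention of scaling LoRA updates by alpha / r,
# this configuration applies a factor of 128 / 64 = 2.
scaling = TRAINING_CONFIG["lora_alpha"] / TRAINING_CONFIG["lora_r"]
```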
Good For
- Agent-based applications: Ideal for developers building AI agents that need to perform complex, multi-step tasks.
- Interactive environments: Suitable for scenarios where the model needs to interact with an environment, observe results, and adapt its actions.
- Tool use and planning: Particularly effective for tasks that involve selecting and using tools or planning sequences of actions.
- Improving Qwen3-4B-Instruct-2507's agentic capabilities: A direct enhancement for the specified base model in agent-related use cases.
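The observe / act / recover loop described above can be sketched as follows. The environment and policy here are stand-in stubs, not part of this repository; they only illustrate the shape of the multi-turn trajectories the adapter was trained on.

```python
def run_episode(env, policy, max_turns=10):
    """Drive a multi-turn trajectory: observe, select an action, recover on error."""
    history = []
    obs = env.reset()
    for _ in range(max_turns):
        action = policy(obs, history)        # action selection from observation
        obs, done, error = env.step(action)  # environment / tool call
        history.append((action, obs))
        if error:
            # Error recovery: surface the failure so the next action can adapt.
            obs = f"Previous action failed: {obs}"
            continue
        if done:
            break
    return history


class ToyEnv:
    """Two-step household task in the spirit of ALFWorld: open fridge, take mug."""

    def reset(self):
        self.opened = False
        return "You see a closed fridge."

    def step(self, action):
        # Returns (observation, done, error).
        if not self.opened and action == "open fridge":
            self.opened = True
            return "The fridge is open; a mug is inside.", False, False
        if self.opened and action == "take mug":
            return "You take the mug. Task complete.", True, False
        return "Nothing happens.", False, True


def toy_policy(obs, history):
    # A trivial rule-based stand-in for the fine-tuned model's action choice.
    return "take mug" if "open" in obs else "open fridge"


history = run_episode(ToyEnv(), toy_policy)
```

In a real deployment, `toy_policy` would be replaced by a call to the adapted model, with the observation history serialized into the chat prompt.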