thetmon/c22
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Feb 26, 2026License:apache-2.0Architecture:Transformer Open Weights Cold
The thetmon/c22 is a 4 billion parameter LoRA adapter for the Qwen3-4B-Instruct-2507 base model, fine-tuned by thetmon. This adapter specializes in enhancing multi-turn agent task performance, particularly in household tasks (ALFWorld) and database operations (DBBench). It improves the model's ability to learn environment observation, action selection, tool use, and error recovery within complex multi-turn trajectories.
Loading preview...
thetmon/c22: Qwen3-4B Agent LoRA Adapter
This repository provides a LoRA adapter (r=64, alpha=128) for the Qwen/Qwen3-4B-Instruct-2507 base model, developed by thetmon. It is specifically fine-tuned to significantly improve the base model's capabilities in multi-turn agent tasks.
Key Capabilities & Training Focus
- Enhanced Agent Performance: The adapter is trained to excel in complex, multi-turn agent environments, focusing on tasks requiring sequential decision-making.
- ALFWorld & DBBench Specialization: It demonstrates improved performance on household task simulations (ALFWorld) and database operation challenges (DBBench).
- Comprehensive Agent Learning: Training loss is applied across all assistant turns in a multi-turn trajectory, enabling the model to learn:
- Environment observation and interpretation.
- Effective action selection and tool use.
- Robust error recovery mechanisms.
Technical Details
- Base Model: Qwen/Qwen3-4B-Instruct-2507
- Methodology: LoRA (Low-Rank Adaptation) with a full-precision base model, utilizing Unsloth for efficient fine-tuning.
- Training Data: A mix of
u-10bei/sft_alfworld_trajectory_dataset_v5andu-10bei/dbbench_sft_dataset_react_v4datasets, both licensed under MIT. - Configuration: Trained for 3 epochs with a learning rate of 2e-04 and a maximum sequence length of 4096 tokens.
Good For
- Developers building AI agents that require robust multi-turn interaction.
- Applications involving automated task execution in simulated or real-world environments.
- Scenarios demanding improved tool use and error handling in conversational or agentic AI systems.