thetmon/c16

Text Generation | Concurrency Cost: 1 | Model Size: 4B | Quant: BF16 | Ctx Length: 32k | Published: Feb 24, 2026 | License: apache-2.0 | Architecture: Transformer | Open Weights | Cold

thetmon/c16 is a LoRA adapter fine-tuned from the 4-billion-parameter Qwen/Qwen3-4B-Instruct-2507 and designed to improve multi-turn agent task performance. It targets household tasks (ALFWorld) and database operations (DBBench), learning environment observation, action selection, and error recovery across complex, multi-step interactions.


thetmon/c16: LoRA Adapter for Enhanced Agentic Performance

The thetmon/c16 model is a LoRA adapter (r=64, alpha=128) for the Qwen/Qwen3-4B-Instruct-2507 base model. It was fine-tuned with Unsloth to improve performance on complex, multi-turn agent tasks.
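
Because the adapter is applied on top of the base model, using it typically means loading both. The following is a minimal sketch, assuming the thetmon/c16 repository hosts adapter weights in standard PEFT format and that the `transformers`, `peft`, and `torch` packages are installed. The `load_adapter` helper is a hypothetical name introduced here for illustration, not something the card defines.

```python
BASE_MODEL_ID = "Qwen/Qwen3-4B-Instruct-2507"  # base model named in the card
ADAPTER_ID = "thetmon/c16"                     # adapter repository


def load_adapter(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the LoRA adapter (downloads weights)."""
    # Imported lazily so this file can be read or imported without the
    # heavy dependencies being present.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    # BF16 matches the precision listed in the card's metadata.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
    # Assumption: the adapter repo follows the standard PEFT layout.
    model = PeftModel.from_pretrained(base, adapter_id)
    return tokenizer, model
```

Calling `tokenizer, model = load_adapter()` would download several gigabytes of base weights, so it is left uncalled here.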

Key Capabilities

  • Multi-turn Agent Task Performance: Specifically optimized for scenarios requiring sequential decision-making and interaction.
  • ALFWorld Proficiency: Demonstrates enhanced ability in household task environments, involving planning and execution.
  • DBBench Operations: Improved performance in database-related tasks, including query generation and manipulation.
  • Error Recovery: Loss was applied to all assistant turns during training, so the model learns from every step of a multi-turn trajectory, including the steps that recover from earlier errors.
  • Tool Use and Action Selection: Designed to better understand environment observations and make appropriate action and tool selections.
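
The card does not say how loss on all assistant turns was implemented. One common way to get that behavior is label masking, sketched below under stated assumptions: turns arrive as already-tokenized `(role, token_ids)` pairs, and non-assistant tokens are given the label -100, which is the default ignore index of PyTorch's cross-entropy loss. The `build_labels` function and the toy token ids are hypothetical.

```python
IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss


def build_labels(turns):
    """Concatenate turns and keep labels only for assistant tokens.

    turns: list of (role, token_ids) pairs in conversation order.
    Returns (input_ids, labels) with equal length.
    """
    input_ids, labels = [], []
    for role, ids in turns:
        input_ids.extend(ids)
        if role == "assistant":
            # Every assistant turn contributes to the loss, not just the last.
            labels.extend(ids)
        else:
            # System prompts and environment observations are masked out.
            labels.extend([IGNORE_INDEX] * len(ids))
    return input_ids, labels


# Toy trajectory: an action, an error observation, then a recovery action.
trajectory = [
    ("system", [1, 2]),
    ("user", [3, 4, 5]),        # initial observation
    ("assistant", [6, 7]),      # first action
    ("user", [8]),              # environment reports an error
    ("assistant", [9, 10, 11])  # recovery action
]
input_ids, labels = build_labels(trajectory)
```

In this sketch both the first action and the recovery action carry labels, so the intermediate turns are supervised alongside the final one.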

Training Details

This adapter was trained for 3 epochs with a learning rate of 2e-04 and a maximum sequence length of 4096. The training data included u-10bei/sft_alfworld_trajectory_dataset_v5 and u-10bei/dbbench_sft_dataset_react_v4, both distributed under the MIT License.
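
For reference, the hyperparameters stated in this card can be collected in one place. The sketch below is not the author's training script: `AdapterTrainingConfig` and `lora_param_count` are hypothetical names, and the scaling factor assumes the standard LoRA convention of alpha / r rather than a variant such as rank-stabilized LoRA. The 1024 x 1024 layer shape is purely illustrative and not taken from the base model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdapterTrainingConfig:
    """Hyperparameters as reported in the card."""
    lora_r: int = 64
    lora_alpha: int = 128
    num_epochs: int = 3
    learning_rate: float = 2e-4
    max_seq_length: int = 4096
    datasets: tuple = (
        "u-10bei/sft_alfworld_trajectory_dataset_v5",
        "u-10bei/dbbench_sft_dataset_react_v4",
    )

    @property
    def lora_scaling(self) -> float:
        # Assumption: standard LoRA scales the low-rank update by alpha / r.
        return self.lora_alpha / self.lora_r


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    The update is B @ A with A of shape (r, d_in) and B of shape (d_out, r).
    """
    return r * (d_in + d_out)


cfg = AdapterTrainingConfig()
scaling = cfg.lora_scaling
# Illustrative layer shape only; not read from the base model's config.
example_params = lora_param_count(1024, 1024, cfg.lora_r)
```

With r=64 and alpha=128, the update is scaled by 2 under the standard convention.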

Good For

  • Developing AI agents that require robust multi-turn interaction capabilities.
  • Applications involving complex task execution in simulated environments like ALFWorld.
  • Database interaction and automation requiring intelligent agent behavior.