thetmon/c11

Text Generation · Concurrency Cost: 1 · Model Size: 4B · Quant: BF16 · Context Length: 32k · Published: Feb 23, 2026 · License: apache-2.0 · Architecture: Transformer · Open Weights

thetmon/c11 is a LoRA adapter for the 4-billion-parameter Qwen/Qwen3-4B-Instruct-2507 base model, fine-tuned to improve multi-turn agent task performance. The adapter specializes in household tasks (ALFWorld) and database operations (DBBench). It focuses on learning environment observation, action selection, tool use, and error recovery within multi-turn trajectories, making it suitable for agentic applications.


Qwen3-4B ALFWorld+DBBench Mixed LoRA Adapter

This repository provides a LoRA adapter (r=64) fine-tuned from the Qwen/Qwen3-4B-Instruct-2507 base model using Unsloth. Note that the repository contains only the LoRA adapter weights; the base model must be loaded separately.
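Loading the base model and attaching the adapter might look like the following sketch, using the `transformers` and `peft` libraries. The dtype and device settings are assumptions chosen to match the BF16 metadata above, not settings stated by this repository:

```python
def load_model(base_id="Qwen/Qwen3-4B-Instruct-2507", adapter_id="thetmon/c11"):
    """Load the base model, then attach this repository's LoRA adapter weights."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    model = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype="bfloat16",   # assumption: matches the BF16 quant listed above
        device_map="auto",
    )
    model = PeftModel.from_pretrained(model, adapter_id)  # adapter weights only
    return tokenizer, model
```

Because only the adapter weights live here, the base model download (and its license terms) still apply at load time.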

Key Capabilities & Training Objective

The primary objective of this adapter is to significantly improve multi-turn agent task performance. It has been specifically trained on a mix of ALFWorld (household tasks) and DBBench (database operations) datasets. The training methodology applies loss to all assistant turns within a multi-turn trajectory, which enables the model to effectively learn:

  • Environment observation
  • Intelligent action selection
  • Proficient tool use
  • Robust error recovery mechanisms

This makes the adapter particularly adept at handling complex, sequential agentic workflows.
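The "loss on all assistant turns" objective can be illustrated with a small sketch: tokens from non-assistant turns are masked with the ignore index so cross-entropy only supervises assistant outputs. The `build_labels` helper and its turn format are illustrative assumptions, not code from this repository:

```python
IGNORE_INDEX = -100  # the label value cross-entropy losses conventionally skip

def build_labels(turns):
    """turns: list of (role, token_ids) pairs for one multi-turn trajectory.

    Returns (input_ids, labels) where every assistant token is supervised
    and all other tokens are masked out of the loss.
    """
    input_ids, labels = [], []
    for role, ids in turns:
        input_ids.extend(ids)
        if role == "assistant":
            labels.extend(ids)                       # supervise every assistant turn
        else:
            labels.extend([IGNORE_INDEX] * len(ids))  # mask user/system tokens
    return input_ids, labels

# Toy trajectory with fake token ids:
ids, labs = build_labels(
    [("user", [1, 2]), ("assistant", [3, 4]), ("user", [5]), ("assistant", [6])]
)
# ids  -> [1, 2, 3, 4, 5, 6]
# labs -> [-100, -100, 3, 4, -100, 6]
```

Supervising every assistant turn, rather than only the final one, is what lets the model learn intermediate steps such as tool calls and recovery actions.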

Training Configuration Highlights

  • Base Model: Qwen/Qwen3-4B-Instruct-2507
  • Method: LoRA (full precision base)
  • Max Sequence Length: 4096 tokens
  • LoRA Parameters: r=64, alpha=128
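As a sketch, the listed hyperparameters correspond to a PEFT `LoraConfig` like the one below. The `target_modules` list is an assumption (the repository does not state which modules were adapted); it names the linear projections commonly targeted in Qwen-style transformers:

```python
from peft import LoraConfig

# Assumed reconstruction of the training config from the listed hyperparameters.
lora_config = LoraConfig(
    r=64,                 # LoRA rank, as listed above
    lora_alpha=128,       # alpha, as listed above
    target_modules=[      # assumption: typical Qwen attention/MLP projections
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```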

Usage Considerations

Users must comply with the MIT License of the training datasets (u-10bei/sft_alfworld_trajectory_dataset_v5 and u-10bei/dbbench_sft_dataset_react_v4) and with the original license terms of the Qwen3-4B-Instruct-2507 base model.