thetmon/alfv5

  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 4B
  • Quant: BF16
  • Ctx Length: 32k
  • Published: Feb 18, 2026
  • License: apache-2.0
  • Architecture: Transformer
  • Open Weights

thetmon/alfv5 is a LoRA adapter for the 4-billion-parameter Qwen/Qwen3-4B-Instruct-2507 model, fine-tuned to improve multi-turn agent task performance. It targets household tasks (ALFWorld) and database operations (DBBench). Training focuses on environment observation, action selection, tool use, and error recovery within multi-turn trajectories.

Overview

This repository provides a LoRA adapter (r=16) fine-tuned from the Qwen/Qwen3-4B-Instruct-2507 base model. It is specifically designed to improve the base model's performance in complex, multi-turn agent tasks.

Key Capabilities

  • Enhanced Multi-Turn Agent Performance: The adapter is trained to improve the model's ability to handle sequential, interactive tasks.
  • Task Specialization: Optimized for two distinct domains:
    • ALFWorld: Household task execution, involving understanding environments and performing actions.
    • DBBench: Database operations, likely including query generation, execution, and result interpretation.
  • Comprehensive Learning: The training objective applies loss to all assistant turns, enabling the model to learn:
    • Environment observation
    • Action selection
    • Tool use
    • Error recovery mechanisms
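
Applying loss to all assistant turns can be illustrated with a minimal label-masking sketch. It assumes the common convention that a label of -100 is ignored by the cross-entropy loss (the PyTorch default `ignore_index`); the `build_labels` helper and the token IDs are hypothetical and are not taken from the actual training code.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss by default


def build_labels(turns):
    """Concatenate tokenized turns and mask every non-assistant token.

    turns: list of (role, token_ids) pairs, already tokenized.
    Returns (input_ids, labels) of equal length.
    """
    input_ids, labels = [], []
    for role, token_ids in turns:
        input_ids.extend(token_ids)
        if role == "assistant":
            # Every assistant turn contributes to the loss, not only the last one.
            labels.extend(token_ids)
        else:
            # System, user, and environment tokens are context only.
            labels.extend([IGNORE_INDEX] * len(token_ids))
    return input_ids, labels


# Hypothetical two-step trajectory with made-up token IDs.
example_turns = [
    ("user", [11, 12, 13]),        # task description / observation
    ("assistant", [21, 22]),       # first action
    ("user", [31]),                # environment feedback
    ("assistant", [41, 42, 43]),   # follow-up action or recovery
]
input_ids, labels = build_labels(example_turns)
```

Because intermediate assistant turns stay unmasked, the model receives a training signal for each action in the trajectory, including actions taken after an error message from the environment.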

Training Details

  • Base Model: Qwen/Qwen3-4B-Instruct-2507
  • Method: LoRA (Low-Rank Adaptation) applied to a full-precision base model
  • Max Sequence Length: 2048 tokens
  • Dataset: u-10bei/sft_alfworld_trajectory_dataset_v5 (MIT license)
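
To give a feel for why a LoRA adapter is small relative to its base model, the sketch below counts the parameters LoRA adds to a single weight matrix at the rank stated in the overview (r=16). The matrix dimensions are hypothetical placeholders, not values read from the Qwen3 configuration, and which layers the adapter actually targets is not stated in this card.

```python
def lora_param_count(d_out, d_in, r):
    # LoRA learns two low-rank factors, B (d_out x r) and A (r x d_in);
    # the original d_out x d_in weight stays frozen.
    return r * (d_out + d_in)


def full_param_count(d_out, d_in):
    # Parameters in the frozen dense weight matrix itself.
    return d_out * d_in


R = 16                     # rank stated in the overview
D_OUT, D_IN = 2560, 2560   # hypothetical layer shape, for illustration only

added = lora_param_count(D_OUT, D_IN, R)
full = full_param_count(D_OUT, D_IN)
ratio = added / full
print(f"LoRA adds {added} parameters to a {full}-parameter matrix ({ratio:.2%})")
```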

Usage Notes

This repository contains only the LoRA adapter weights. Users must load the base model (Qwen/Qwen3-4B-Instruct-2507) separately and then apply the adapter on top of it; merging the adapter into the base weights is optional.
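
A minimal loading sketch, assuming the adapter is published in standard PEFT format under the repository ID `thetmon/alfv5` and that the `transformers`, `peft`, and `torch` packages are installed. The `load_alfv5` helper is a hypothetical name, and the BF16 dtype follows the quantization listed for this model rather than anything the adapter requires.

```python
BASE_MODEL_ID = "Qwen/Qwen3-4B-Instruct-2507"
ADAPTER_ID = "thetmon/alfv5"  # assumed to be a PEFT-format adapter repository


def load_alfv5(merge=False):
    """Load the base model, attach the LoRA adapter, and optionally merge it."""
    # Imports are kept inside the function so that defining it stays lightweight.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL_ID,
        torch_dtype=torch.bfloat16,
    )
    # Wrap the base model with the adapter weights.
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    if merge:
        # Fold the low-rank update into the base weights and drop the wrapper.
        model = model.merge_and_unload()
    return tokenizer, model
```

Calling `load_alfv5()` keeps the adapter as a separate wrapper around the base model, while `load_alfv5(merge=True)` returns a plain model with the update folded in, which can be convenient for export or serving.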