DCAgent/a1-agenttuning_alfworld

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 32k · Published: Mar 26, 2026 · License: other · Architecture: Transformer

DCAgent/a1-agenttuning_alfworld is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B. It is optimized for agent tuning in the AlfWorld environment, using a dataset derived from neulab-agenttuning-alfworld-sandboxes_glm_4.7_traces_jupiter, and is intended to improve agentic reasoning and interaction in text-based environments.


Overview

DCAgent/a1-agenttuning_alfworld is built on the Qwen3-8B architecture. It was fine-tuned on the DCAgent/neulab-agenttuning-alfworld-sandboxes_glm_4.7_traces_jupiter dataset (snapshot e97f6ac19973c2efc4d0ee484ad47f57f59a6d06, thinking-preprocessed), which suggests a focus on agentic behavior within the AlfWorld environment.
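Assuming the model follows the standard Hugging Face checkpoint layout inherited from Qwen3-8B, it can be loaded with `transformers` as a sketch like the following. The repo id comes from this card; the system prompt, task wording, and generation settings are illustrative assumptions, since the card does not publish the exact prompt template used during fine-tuning.

```python
def alfworld_prompt(task: str, observation: str) -> list[dict]:
    """Build a chat-format prompt for one AlfWorld-style episode step.

    The system wording is a hypothetical placeholder -- adjust it to match
    however the training traces framed the environment.
    """
    return [
        {"role": "system",
         "content": "You are an agent in a text-based household environment. "
                    "Respond with a single action."},
        {"role": "user", "content": f"Task: {task}\nObservation: {observation}"},
    ]


if __name__ == "__main__":
    # Heavy path: needs `transformers`, `torch`, and enough memory for an 8B model.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "DCAgent/a1-agenttuning_alfworld"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype="auto",   # use the dtype the checkpoint was published in
        device_map="auto",    # shard across available devices
    )

    messages = alfworld_prompt(
        "put a clean mug on the coffeemachine",
        "You are in the middle of a kitchen. You see a countertop 1 and a sinkbasin 1.",
    )
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=128)
    print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

The prompt-building helper is kept separate so the episode loop of an AlfWorld harness can reuse it, feeding each new observation back through `alfworld_prompt`.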

Key Capabilities

  • Agent-tuning: Specifically fine-tuned for tasks requiring agentic reasoning and interaction.
  • AlfWorld Optimization: Designed to perform effectively within the AlfWorld text-based game environment.
  • Qwen3-8B Base: Benefits from the foundational capabilities of the Qwen3-8B model.

Training Details

The model was trained with a learning rate of 4e-05 over 7 epochs, with a total batch size of 16 spread across 16 GPUs. It used the fused AdamW optimizer (adamw_torch_fused) with a cosine learning-rate scheduler and a warmup ratio of 0.1. Training used Transformers 4.57.6 and PyTorch 2.9.1+cu130.
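The cosine schedule with 0.1 warmup described above can be sketched in plain Python. The peak rate of 4e-5 and the warmup ratio come from this card; the total step count is a free parameter, and the exact implementation in Transformers may differ in small details (e.g. minimum learning rate, cycle count).

```python
import math

PEAK_LR = 4e-5       # learning rate from this card
WARMUP_RATIO = 0.1   # warmup ratio from this card


def lr_at(step: int, total_steps: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay toward zero."""
    warmup_steps = int(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Ramp linearly from 0 to the peak over the warmup phase.
        return PEAK_LR * step / max(1, warmup_steps)
    # Cosine decay over the remaining steps: 1.0 at the peak, 0.0 at the end.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, with 1000 total steps the rate climbs over the first 100 steps, peaks at 4e-5, and decays smoothly to near zero by step 1000.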

Good For

  • Research in Agentic AI: Ideal for researchers exploring agent behavior and decision-making in interactive environments.
  • AlfWorld Tasks: Suitable for applications requiring strong performance in the AlfWorld benchmark.
  • Fine-tuning Experiments: Provides a base for further experimentation on agent-tuning methodologies.