kazuyamaa/alfworld-lambda-grpo-v002-hull

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Mar 1, 2026License:apache-2.0Architecture:Transformer Open Weights Warm

The kazuyamaa/alfworld-lambda-grpo-v002-hull is a 4 billion parameter Qwen3 model developed by kazuyamaa, fine-tuned from kazuyamaa/Qwen3-afworld-v001. This model was trained using Unsloth and Huggingface's TRL library, achieving 2x faster training speeds. It is designed for specific applications within the ALFWorld environment, leveraging its optimized training for efficient performance.

Loading preview...

Model Overview

The kazuyamaa/alfworld-lambda-grpo-v002-hull is a 4 billion parameter Qwen3 model developed by kazuyamaa. It is a fine-tuned version of kazuyamaa/Qwen3-afworld-v001, specifically optimized for tasks within the ALFWorld environment.

Key Capabilities

  • Efficient Training: This model was trained with Unsloth and Huggingface's TRL library, resulting in a 2x speed improvement during the training process.
  • ALFWorld Specialization: Fine-tuned for performance in the ALFWorld environment, suggesting enhanced capabilities for embodied AI tasks and interactive simulations.

Good For

  • ALFWorld Research: Ideal for researchers and developers working on tasks and experiments within the ALFWorld benchmark.
  • Efficient Fine-tuning: Demonstrates the effectiveness of using tools like Unsloth for accelerating model training.