kazuyamaa/alfworld-lambda-grpo-v004
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:Mar 1, 2026License:apache-2.0Architecture:Transformer Open Weights Warm
The kazuyamaa/alfworld-lambda-grpo-v004 is a 4 billion parameter Qwen3 model developed by kazuyamaa. This model was finetuned from kazuyamaa/alfworld-lambda-grpo-v002-hull and optimized for training speed using Unsloth and Huggingface's TRL library. It is designed for tasks related to the ALFWorld environment, leveraging its specialized finetuning for improved performance in interactive text-based games.
Loading preview...