xwm/ALFWorld-MPO
Text generation · 8B parameters · FP8 quantization · 32k context length · Apache-2.0 license · Transformer architecture · open weights

ALFWorld-MPO is an 8 billion parameter language model developed by xwm, fine-tuned from Llama-3.1-8B-Instruct. It is specifically optimized for agentic tasks within the ALFWorld environment, leveraging Meta Plan Optimization (MPO) to enhance planning and decision-making. This model demonstrates improved reward metrics and accuracy on preference pairs, making it suitable for complex interactive environments.
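As a rough illustration of how such an agent is typically driven, the sketch below shows a minimal observation-to-action loop in the style of ALFWorld. The prompt format, the `parse_action` helper, and the stubbed `generate()` function are assumptions for illustration, not the model's documented interface; in practice `generate()` would call the model itself (e.g. via Hugging Face `transformers` with the `xwm/ALFWorld-MPO` checkpoint).

```python
import re

def parse_action(completion: str) -> str:
    """Extract the first 'Action: ...' line from a model completion.

    Hypothetical output convention: the model emits a Thought line
    followed by an Action line, as in ReAct-style agent prompting.
    """
    match = re.search(r"Action:\s*(.+)", completion)
    return match.group(1).strip() if match else ""

def generate(prompt: str) -> str:
    # Stand-in for a real model call, e.g.:
    #   pipeline("text-generation", model="xwm/ALFWorld-MPO")(prompt)
    # Returns a canned completion so the sketch is self-contained.
    return "Thought: The apple is on the table.\nAction: take apple from table"

observation = "You are in a kitchen. You see a table with an apple on it."
prompt = f"Observation: {observation}\nWhat do you do next?"
action = parse_action(generate(prompt))
print(action)  # -> take apple from table
```

The environment would then execute `action` and return the next observation, closing the loop until the task goal is reached.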
