Model Overview
X1AOX1A/WorldModel-Alfworld-Qwen2.5-7B is a specialized 7.6-billion-parameter language model fine-tuned from the Qwen2.5-7B architecture. Its core purpose is to investigate whether large language models can serve as implicit "World Models" in text-based environments. This model is a component of the research presented in the paper "From Word to World: Can Large Language Models be Implicit Text-based World Models?" (arXiv:2512.18832).
Key Characteristics
- Base Model: Fine-tuned from Qwen/Qwen2.5-7B.
- Domain Specificity: Optimized for the alfworld_train_with_env_54006 dataset, focusing on the Alfworld text-based environment.
- Research Focus: Designed to explore the capacity of LLMs to act as implicit world models for agentic reinforcement learning.
Training Details
The model was trained for 5 epochs with a learning rate of 1e-05 and a total batch size of 128 (train_batch_size=2 × gradient_accumulation_steps=16 × 4 GPUs). It used a constant learning rate scheduler with 10 warmup steps, on a multi-GPU setup of 4 devices.
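The hyperparameters above can be collected in one place. The sketch below is illustrative, not the published training script; the field names follow the standard `transformers.TrainingArguments` convention, and the effective-batch-size arithmetic shows how the reported total of 128 is reached.

```python
# Reported fine-tuning hyperparameters, written as the argument dict one
# might pass to a Hugging Face Trainer. Field names follow the standard
# transformers.TrainingArguments API; the actual training script for this
# checkpoint is not published, so treat this as a reconstruction.
training_args = {
    "learning_rate": 1e-05,
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 16,
    "num_train_epochs": 5,
    "lr_scheduler_type": "constant_with_warmup",
    "warmup_steps": 10,
}

NUM_DEVICES = 4  # multi-GPU setup reported in the model card


def effective_batch_size(args: dict, num_devices: int) -> int:
    """Total examples consumed per optimizer step across all devices."""
    return (
        args["per_device_train_batch_size"]
        * args["gradient_accumulation_steps"]
        * num_devices
    )


print(effective_batch_size(training_args, NUM_DEVICES))  # → 128
```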
Intended Use Cases
- Research: Ideal for academic and research purposes, particularly in the fields of AI agents, reinforcement learning, and understanding LLM internal representations.
- Alfworld Tasks: Suitable for experiments and development within the Alfworld text-based game environment.
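For Alfworld experiments, the checkpoint can be queried through the standard `transformers` API. The sketch below is a hypothetical usage example: the prompt template and the `build_prompt`/`predict_next_state` helpers are illustrative assumptions, not the paper's actual input format.

```python
# Hypothetical usage sketch for querying the checkpoint as a text-based
# world model: given an observation and an action, ask the model to
# predict the next observation. The prompt template is an assumption;
# the paper's actual input formatting is not reproduced here.
MODEL_ID = "X1AOX1A/WorldModel-Alfworld-Qwen2.5-7B"


def build_prompt(observation: str, action: str) -> str:
    """Assemble an (observation, action) query asking for the next state.
    Illustrative template only."""
    return (
        f"Observation: {observation}\n"
        f"Action: {action}\n"
        "Next observation:"
    )


def predict_next_state(observation: str, action: str,
                       max_new_tokens: int = 128) -> str:
    """Generate the model's predicted next observation.
    Note: downloads the ~15 GB checkpoint on first call."""
    # Heavy imports deferred so the helpers above stay usable without them.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(build_prompt(observation, action), return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Strip the prompt tokens, keep only the newly generated continuation.
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Since the model is fine-tuned on Alfworld trajectories, prompts that mirror the training distribution's observation/action phrasing are likely to transfer best.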