Model Overview
This model, X1AOX1A/WorldModel-Textworld-Qwen2.5-7B, is a fine-tuned version of the Qwen/Qwen2.5-7B base model, with 7.6 billion parameters and a 32K context window. Its primary purpose is to investigate "World Models" in text-based environments, as detailed in the associated arXiv paper "From Word to World: Can Large Language Models be Implicit Text-based World Models?".
Key Characteristics
- Base Model: Fine-tuned from Qwen/Qwen2.5-7B.
- Specialization: Focused on understanding and generating responses within text-based interactive scenarios, aiming to implicitly model the environment.
- Training Data: Fine-tuned on the textworld_train_58805 dataset.
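Since this is a standard Qwen2.5 fine-tune, it should load through the usual Hugging Face transformers causal-LM API. This is a minimal sketch, assuming the standard AutoModelForCausalLM / AutoTokenizer interface (the card itself does not show loading code); only the repository id is taken from the card.

```python
MODEL_ID = "X1AOX1A/WorldModel-Textworld-Qwen2.5-7B"  # repo id from the card

def load_model(device_map="auto"):
    """Load the fine-tune with the standard transformers API (illustrative)."""
    # Imports kept inside the helper so the sketch stays lightweight.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype="auto",   # dtype/device settings are illustrative choices
        device_map=device_map,
    )
    return tokenizer, model

if __name__ == "__main__":
    tokenizer, model = load_model()  # downloads the 7.6B-parameter weights
```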
Training Details
The model was trained for 5 epochs with a learning rate of 1e-05 and an effective batch size of 128 (train_batch_size=2 with gradient_accumulation_steps=16 across 4 devices). It used the AdamW optimizer with a constant learning-rate schedule and 10 warmup steps.
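The effective batch size follows from per-device batch size × gradient-accumulation steps × device count. A minimal sketch of the reported hyperparameters, with key names that follow common Trainer conventions and are assumptions, not the authors' exact config:

```python
# Hyperparameters as reported on the card; the dict keys are illustrative.
config = {
    "learning_rate": 1e-05,
    "train_batch_size": 2,             # per device
    "gradient_accumulation_steps": 16,
    "num_devices": 4,
    "num_epochs": 5,
    "optimizer": "AdamW",
    "lr_scheduler": "constant",
    "warmup_steps": 10,
}

# 2 per device x 16 accumulation steps x 4 devices = 128
effective_batch_size = (
    config["train_batch_size"]
    * config["gradient_accumulation_steps"]
    * config["num_devices"]
)
print(effective_batch_size)  # 128
```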
Potential Use Cases
This model is particularly relevant for research into:
- Agentic RL: Exploring how LLMs can serve as implicit world models for reinforcement learning agents in text-based games.
- Interactive Fiction: Developing more sophisticated and context-aware AI for text adventures and interactive storytelling.
- Environmental Understanding: Investigating the capacity of LLMs to build and maintain internal representations of dynamic text-based worlds.
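One concrete way to probe the implicit-world-model idea above is to prompt the model with a TextWorld-style observation and a candidate action, then ask it to predict the resulting observation. The template below is a hypothetical illustration; the paper's actual prompt format is not given on this card.

```python
def world_model_prompt(observation: str, action: str) -> str:
    """Build a next-state prediction prompt (the format is an assumption)."""
    return (
        "You are a text-based world model.\n"
        f"Current observation:\n{observation}\n"
        f"Action: {action}\n"
        "Predicted next observation:"
    )

prompt = world_model_prompt(
    observation="You are in a kitchen. There is a closed fridge here.",
    action="open fridge",
)
# Feed `prompt` to the loaded model's generate() call to obtain the
# predicted next state of the environment.
```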