Model Overview
X1AOX1A/WorldModel-Webshop-Qwen2.5-7B is a fine-tuned version of the Qwen/Qwen2.5-7B base model (7.6 billion parameters). It was trained on the webshop_train_70790 dataset, which specializes it for tasks in webshop environments. The model was developed as part of research exploring Large Language Models (LLMs) as implicit text-based world models, particularly for their potential in scalable agentic reinforcement learning.
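As a reference point, the checkpoint should load with the standard Hugging Face transformers API. The snippet below is a minimal sketch: the repo id comes from this card, while the `torch_dtype` and `device_map` settings are illustrative choices rather than values taken from the original card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "X1AOX1A/WorldModel-Webshop-Qwen2.5-7B"

# Load the tokenizer and model weights from the Hub.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",   # use the dtype stored in the checkpoint
    device_map="auto",    # spread weights across available GPUs
)
```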
Key Training Details
The model was trained with a learning rate of 1e-05 for 5 epochs on a multi-GPU setup with 4 devices. The effective batch size was 128, obtained as a per-device train_batch_size of 1 × gradient_accumulation_steps of 32 × 4 GPUs. The optimizer was AdamW with specific beta and epsilon values, paired with a constant learning rate scheduler and 10 warmup steps.
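These hyperparameters map directly onto the Hugging Face TrainingArguments API. The sketch below is a reconstruction for illustration, not the authors' actual training script: the `output_dir`, `optim`, and `bf16` settings are assumptions, and the AdamW beta/epsilon values, which are not restated here, are left at library defaults.

```python
from transformers import TrainingArguments

# Hyperparameters reported in this card, expressed as Trainer arguments.
# Settings not stated in the card are assumptions, marked below.
training_args = TrainingArguments(
    output_dir="worldmodel-webshop-qwen2.5-7b",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=32,
    num_train_epochs=5,
    lr_scheduler_type="constant_with_warmup",
    warmup_steps=10,
    optim="adamw_torch",  # AdamW; beta/epsilon left at library defaults
    bf16=True,            # assumption: bf16 mixed precision
)
# Effective batch size: 1 (per device) x 32 (accumulation) x 4 (GPUs) = 128
```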
Potential Applications
Given its fine-tuning on a webshop dataset, this model is likely optimized for tasks such as:
- Automated web navigation and interaction
- E-commerce related language understanding
- Agentic tasks within web environments
The original model card does not explicitly state intended uses or limitations, but the specialized training suggests a focus on web-based agentic behaviors; a hedged inference sketch is shown below.
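Continuing from the loading sketch above, a plausible inference pattern is to prompt the model with a webshop-style observation and decode an action. The prompt format here is hypothetical; the actual state/action template used during fine-tuning is not documented in this card.

```python
import torch

# Hypothetical WebShop-style observation; the real prompt template used
# during fine-tuning is not documented in this card.
prompt = (
    "Instruction: buy a pair of black running shoes under $50.\n"
    "Observation: [Search results page with 10 items]\n"
    "Action:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=64)

# Decode only the newly generated tokens (the predicted action).
generated = output[0][inputs["input_ids"].shape[1]:]
print(tokenizer.decode(generated, skip_special_tokens=True))
```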