X1AOX1A/WorldModel-Webshop-Qwen2.5-7B

Parameters: 7.6B
Precision: FP8
Context length: 131,072
Updated: Dec 9, 2025
License: other

Model Overview

X1AOX1A/WorldModel-Webshop-Qwen2.5-7B is a 7.6-billion-parameter fine-tune of the Qwen/Qwen2.5-7B base model. It was trained on the webshop_train_70790 dataset, indicating a specialization in webshop-environment tasks, and was developed within research exploring Large Language Models (LLMs) as implicit text-based world models, particularly for their potential in scalable agentic reinforcement learning.
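
A minimal loading and inference sketch, assuming the checkpoint loads through the standard transformers API and inherits the Qwen2.5 chat template (neither is confirmed by the original card; the prompt shown is purely illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "X1AOX1A/WorldModel-Webshop-Qwen2.5-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

# Hypothetical prompt; the input format used during fine-tuning is not documented.
messages = [{"role": "user", "content": "You are browsing a webshop. Observation: ..."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```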

Key Training Details

The model was trained for 5 epochs with a learning rate of 1e-05 and an effective batch size of 128, obtained from a per-device train_batch_size of 1, gradient_accumulation_steps of 32, and a multi-GPU setup with 4 devices (1 × 32 × 4 = 128). The optimizer was AdamW with the beta and epsilon values reported in the original card, paired with a constant learning-rate scheduler using 10 warmup steps.
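
For orientation, these hyperparameters map onto a transformers TrainingArguments configuration roughly as follows. This is a sketch only; the authors' actual training script, their AdamW beta/epsilon values, and the choice of optimizer implementation are not published in the card:

```python
from transformers import TrainingArguments

# Illustrative mapping of the reported hyperparameters; values marked
# "assumed" are not confirmed by the original model card.
args = TrainingArguments(
    output_dir="worldmodel-webshop-qwen2.5-7b",  # assumed
    learning_rate=1e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=32,  # 1 x 32 x 4 GPUs = 128 effective batch size
    num_train_epochs=5,
    lr_scheduler_type="constant_with_warmup",  # constant LR after warmup
    warmup_steps=10,
    optim="adamw_torch",  # assumed AdamW implementation
)
```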

Potential Applications

Given its fine-tuning on a webshop dataset, this model is likely optimized for tasks such as:

  • Automated web navigation and interaction
  • E-commerce-related language understanding
  • Agentic tasks within web environments

The original model card does not explicitly state intended uses or limitations, but the specialized training suggests a focus on web-based agentic behaviors.
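
As a rough illustration of the "LLM as implicit text-based world model" idea, the sketch below rolls the model forward one step in a simulated webshop episode: given a textual observation and an agent action, it samples the model's predicted next observation. The prompt format and the helper predict_next_observation are hypothetical; the actual state/action encoding used during fine-tuning is not documented in the card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "X1AOX1A/WorldModel-Webshop-Qwen2.5-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

def predict_next_observation(observation: str, action: str) -> str:
    """Hypothetical world-model step: predict the next webshop observation
    that would result from taking `action` in the state described by
    `observation`. The prompt layout here is an assumption."""
    prompt = f"Observation:\n{observation}\n\nAction: {action}\n\nNext observation:\n"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )

# An agent could roll out candidate actions against this simulated environment
# before committing to one in the real webshop.
print(predict_next_observation("Search results page for 'wireless mouse'.", "click[Buy Now]"))
```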