langfeng01/GiGPO-Qwen2.5-7B-Instruct-WebShop

Text Generation | Concurrency Cost: 1 | Model Size: 7.6B | Quant: FP8 | Ctx Length: 32k | Published: Jun 11, 2025 | License: apache-2.0 | Architecture: Transformer | Open Weights | Warm

langfeng01/GiGPO-Qwen2.5-7B-Instruct-WebShop is a 7.6-billion-parameter model fine-tuned from Qwen2.5-7B-Instruct using the GiGPO method. It is specialized for autonomous agent operation in the WebShop e-commerce environment, where it reasons about the current page and selects actions to achieve a shopping goal. The model is described as supporting a 131072-token context length, although this listing serves it with a 32k context; the long window leaves room for the observation and action histories of multi-step interactive tasks in web-based environments.


Overview

GiGPO-Qwen2.5-7B-Instruct-WebShop is a 7.6-billion-parameter language model fine-tuned from Qwen2.5-7B-Instruct. It was trained with GiGPO (Group-in-Group Policy Optimization), a reinforcement learning method for LLM agents, as detailed in the associated arXiv paper. The model is designed for autonomous agent tasks in the WebShop e-commerce environment, where it navigates pages, reasons about what it observes, and issues actions to fulfill shopping objectives.

Key Capabilities

  • Autonomous Agent Operation: Specialized in acting as an expert agent within the WebShop environment.
  • Step-by-Step Reasoning: Employs a <think> tag for detailed internal reasoning before taking an action.
  • Action Selection: Selects appropriate actions from a given set of admissible options, enclosed within <action> tags.
  • Contextual Awareness: Utilizes prompt templates that incorporate task descriptions, current observations, available actions, and historical interactions to inform decision-making.
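The capabilities above suggest a simple loop around the model: build a prompt from the task, observation, admissible actions, and history, then read the reasoning and the chosen action back out of the tagged response. The following is a minimal sketch of that wrapper, not the prompt or parser used in training. The function names `build_prompt` and `parse_response`, the wording of the template, and the exact-match check against the admissible actions are all assumptions made for illustration; only the `<think>` and `<action>` tags come from the description above.

```python
import re

# Hypothetical prompt layout: the exact template used during training is not
# given in this listing, so the wording below is an assumption.
def build_prompt(task, observation, admissible_actions, history):
    """Assemble a prompt from the four ingredients named in the model card."""
    past = "\n".join(
        f"Observation: {obs}\nAction: {act}" for obs, act in history
    ) or "(none)"
    options = "\n".join(f"- {a}" for a in admissible_actions)
    return (
        "You are an expert agent operating in the WebShop environment.\n"
        f"Task: {task}\n"
        f"History:\n{past}\n"
        f"Current observation: {observation}\n"
        f"Admissible actions:\n{options}\n"
        "Reason inside <think></think>, then give one action "
        "inside <action></action>."
    )

_THINK = re.compile(r"<think>(.*?)</think>", re.DOTALL)
_ACTION = re.compile(r"<action>(.*?)</action>", re.DOTALL)

def parse_response(text, admissible_actions):
    """Return (reasoning, action).

    The action is None when the <action> tag is missing or when its content
    is not one of the admissible actions (exact string match, an assumption;
    free-text actions such as search queries would need looser matching).
    """
    think = _THINK.search(text)
    action = _ACTION.search(text)
    reasoning = think.group(1).strip() if think else ""
    chosen = action.group(1).strip() if action else None
    if chosen not in admissible_actions:
        chosen = None
    return reasoning, chosen
```

In use, the string returned by `build_prompt` would be sent to the model, and the generated text passed to `parse_response` together with the same list of admissible actions. A `None` action signals that the caller should retry or fall back to a default rather than forward an invalid action to the environment.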

Good For

  • Web-based Agent Development: Ideal for researchers and developers working on agents that interact with e-commerce platforms or similar web interfaces.
  • Reinforcement Learning Research for LLM Agents: Demonstrates an application of the GiGPO training methodology to complex, multi-step interactive tasks.
  • Simulated E-commerce Tasks: Suitable for automating shopping processes, product search, and other related activities in a simulated environment.