langfeng01/GiGPO-Qwen2.5-7B-Instruct-ALFWorld

Task: Text Generation | Concurrency Cost: 1 | Model Size: 7.6B | Quant: FP8 | Ctx Length: 32k | Published: Jun 11, 2025 | License: apache-2.0 | Architecture: Transformer | Open Weights

langfeng01/GiGPO-Qwen2.5-7B-Instruct-ALFWorld is a 7.6 billion parameter instruction-tuned language model based on the Qwen2.5 architecture. Developed by langfeng01, it was trained with the GiGPO method for embodied AI tasks in ALFWorld, a text-based counterpart to the ALFRED embodied benchmark, referred to here as the ALFRED Embodied Environment. It targets sequential decision-making and action generation in interactive simulated environments. The Qwen2.5 base architecture supports contexts of up to 131,072 tokens, although this listing is served with a 32k context length.


Model Overview

Built on the Qwen2.5 architecture, this 7.6 billion parameter instruction-tuned model is specialized for embodied AI tasks in the ALFRED Embodied Environment. A long context window (up to 131,072 tokens for Qwen2.5, 32k as listed for this deployment) lets it process extensive observation histories and task descriptions for multi-step sequential decision-making.

Key Capabilities

  • Embodied AI Task Execution: Specifically trained to operate as an expert agent in the ALFRED Embodied Environment, handling tasks that require understanding observations and generating appropriate actions.
  • GiGPO Training: Trained with GiGPO (Group-in-Group Policy Optimization), a reinforcement learning method for LLM agents described in the associated arXiv paper, to improve its performance in interactive environments.
  • Structured Reasoning and Action: Designed to perform step-by-step reasoning within <think> tags before selecting an admissible action, presented within <action> tags, following a specific prompt template.
  • Contextual Understanding: Benefits from its large context window to maintain a detailed history of observations and actions, crucial for navigating and completing multi-step tasks.
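The <think>/<action> output format described above lends itself to a small parser on the environment side. The following is a minimal sketch, not the author's evaluation code: the function name `parse_response` and the case-insensitive matching against the admissible list are assumptions made for illustration.

```python
import re
from typing import List, Optional, Tuple

# Patterns for the two tag pairs named in the model card.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL | re.IGNORECASE)
ACTION_RE = re.compile(r"<action>(.*?)</action>", re.DOTALL | re.IGNORECASE)


def parse_response(text: str, admissible: List[str]) -> Tuple[str, Optional[str]]:
    """Split a model response into (reasoning, action).

    Hypothetical helper: returns None for the action when the <action> tag
    is missing or its content is not one of the admissible actions.
    """
    think = THINK_RE.search(text)
    reasoning = think.group(1).strip() if think else ""

    action = ACTION_RE.search(text)
    if action is None:
        return reasoning, None

    candidate = action.group(1).strip().lower()
    for option in admissible:
        # Assumption: admissible actions are compared case-insensitively.
        if option.strip().lower() == candidate:
            return reasoning, option
    return reasoning, None
```

Returning None instead of raising lets the calling loop decide how to handle an invalid step, for example by re-prompting or counting it as a failed action.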

Good For

  • Research in Embodied AI: Ideal for researchers and developers working on agents for simulated environments like ALFRED.
  • Sequential Decision-Making: Applications requiring an agent to reason and act based on a series of observations and a history of actions.
  • Developing Intelligent Agents: Useful for building agents that can interpret complex instructions and execute tasks in interactive settings.
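
For agents that act on a history of observations and actions, the prompt is typically rebuilt at every step from a bounded window of recent steps. The sketch below illustrates that idea only: the `StepHistory` class, the window size, and the prompt wording are hypothetical and do not reproduce the model's actual prompt template.

```python
from collections import deque
from typing import Deque, List, Tuple


class StepHistory:
    """Keeps the most recent (observation, action) pairs for prompt building."""

    def __init__(self, max_steps: int = 2) -> None:
        # deque(maxlen=...) silently drops the oldest step once full.
        self._steps: Deque[Tuple[int, str, str]] = deque(maxlen=max_steps)
        self._count = 0

    def add(self, observation: str, action: str) -> None:
        self._count += 1
        self._steps.append((self._count, observation, action))

    def build_prompt(self, task: str, observation: str, admissible: List[str]) -> str:
        # Illustrative layout only; substitute the real template when available.
        lines = [f"Task: {task}"]
        for idx, obs, act in self._steps:
            lines.append(f"Step {idx} observation: {obs}")
            lines.append(f"Step {idx} action: {act}")
        lines.append(f"Current observation: {observation}")
        lines.append("Admissible actions: " + ", ".join(admissible))
        lines.append("Reason inside <think></think>, then answer inside <action></action>.")
        return "\n".join(lines)
```

Bounding the window keeps the prompt within the served context length on long episodes, at the cost of forgetting early steps.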