GGOSinon/babyai-world-model-7B-sft

Text generation · Model size: 7.6B · Quant: FP8 · Context length: 32k · Published: Apr 23, 2026 · License: apache-2.0 · Architecture: Transformer · Open weights

GGOSinon/babyai-world-model-7B-sft is a 7.6-billion-parameter world model fine-tuned from Qwen2.5-7B-Instruct for the BabyAI grid-world environment. Given the current state and the agent's action, it predicts the next observation and the set of available actions, reaching 97.1% accuracy on done-detection. The model is intended for simulating agent-environment interactions within BabyAI, with high precision and recall on task-completion detection.


BabyAI World Model (Qwen2.5-7B SFT)

This model is a world model fine-tuned specifically for the BabyAI grid-world environment. It is built on Qwen2.5-7B-Instruct and fine-tuned with LoRA for parameter efficiency.
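A minimal sketch of querying the model as a one-step simulator. The prompt wording and the `build_prompt`/`predict_next` helpers are assumptions for illustration, not the documented schema; check the training dataset (GGOSinon/babyai-world-model-sft) for the exact input format. Only the done-marker convention ("The task is completed." appended to the observation) comes from this card.

```python
# Sketch: one world-model step (observation, action) -> predicted next observation.
# build_prompt uses an ASSUMED wording; the real SFT prompt format may differ.

DONE_MARKER = "The task is completed."  # completion convention from the model card

def build_prompt(observation: str, action: str) -> str:
    """Pair the current observation with the agent's action (illustrative format)."""
    return (
        f"Current observation:\n{observation}\n"
        f"Agent action: {action}\n"
        "Predict the next observation and the available actions."
    )

def is_done(predicted_observation: str) -> bool:
    """The model signals task completion by appending the marker to the observation."""
    return predicted_observation.rstrip().endswith(DONE_MARKER)

def predict_next(observation: str, action: str) -> str:
    """Run the fine-tuned checkpoint with transformers (needs a GPU for a 7.6B model)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer  # heavy; deferred import

    tok = AutoTokenizer.from_pretrained("GGOSinon/babyai-world-model-7B-sft")
    model = AutoModelForCausalLM.from_pretrained(
        "GGOSinon/babyai-world-model-7B-sft", device_map="auto"
    )
    messages = [{"role": "user", "content": build_prompt(observation, action)}]
    inputs = tok.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(inputs, max_new_tokens=256)
    return tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)
```

In a rollout loop, `is_done` on the predicted observation decides whether to stop simulating.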

Key Capabilities

  • Predicts next observation: Given the current state and an agent's action, the model can accurately predict the subsequent environmental observation.
  • Predicts available actions: It identifies the actions an agent can take in the predicted next state.
  • Task completion detection: The model indicates task completion by appending "The task is completed." to the observation text.
  • High performance in BabyAI simulation: Achieves 97.1% done-detection accuracy on 102 test cases, with 100% precision and 91.2% recall, matching Gemini 2.5 Flash zero-shot performance.

Training Details

  • Fine-tuned using LoRA with 40.4M trainable parameters (0.53% of the 7.66B total).
  • Trained on the GGOSinon/babyai-world-model-sft dataset, comprising 58K transitions over 1 epoch.
  • Training took approximately 5.5 hours on a single A100 40GB GPU.
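The stated trainable-parameter ratio checks out arithmetically:

```python
# Verifying the LoRA ratio: 40.4M trainable of 7.66B total parameters.
trainable = 40.4e6
total = 7.66e9
ratio_pct = 100 * trainable / total
print(f"{ratio_pct:.2f}%")  # → 0.53%
```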

Good For

  • Simulating agent behavior within the BabyAI environment.
  • Developing and testing AI agents that interact with grid-world environments.
  • Research in world modeling for simplified, controlled environments.