oro-ai/qwen3-4b-shoppingbench-sft

TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Jun 8, 2026License:apache-2.0Architecture:Transformer Open Weights Cold

The oro-ai/qwen3-4b-shoppingbench-sft is a 4 billion parameter Qwen3-based supervised fine-tuned language model developed by ORO-AI. It is specifically optimized for shopping agent tasks, achieving a 42.7% ASR on a leak-cluster-guarded held-out partition of the ShoppingBench SN15 corpus. This model is designed to distill a shopping agent from ShoppingBench subnet traces, making it suitable for e-commerce automation and agentic applications.

Loading preview...

Model Overview

The oro-ai/qwen3-4b-shoppingbench-sft is a 4-billion parameter language model based on the Qwen3 architecture, developed by ORO-AI. It has undergone supervised fine-tuning (SFT) using the leak-cluster-guarded ShoppingBench SN15 corpus. This model serves as a companion artifact for the research paper "Bittensor Agent Arenas as a Trajectory Primitive: Distilling a Shopping Agent from ShoppingBench Subnet Traces."

Key Capabilities

  • Enhanced Shopping Agent Performance: Significantly improves upon the base Qwen3-4B model's performance on ShoppingBench, achieving a 42.7% Agent Success Rate (ASR) on a production-strict, held-out partition, compared to the base model's 18.0% ASR.
  • Trajectory Primitive Distillation: Designed to distill a shopping agent from complex ShoppingBench subnet traces, enabling more effective automated shopping interactions.
  • Ready-to-Use Integration: Provided as a merged full model, allowing direct loading with transformers or serving with vLLM without the need for adapter stacking.

Training Data

The model was fine-tuned on a filtered corpus, oro-ai/sn15-shoppingbench-sft-15k, derived from raw traces available at oro-ai/sn15-shoppingbench-traces-18k.

Good For

  • Developing and deploying automated shopping agents.
  • Research into agentic behavior and trajectory primitives in e-commerce.
  • Applications requiring specialized language understanding and generation for online shopping scenarios.