oro-ai/qwen3-4b-shoppingbench-kto
The oro-ai/qwen3-4b-shoppingbench-kto model is a 4 billion parameter Qwen3-based language model developed by ORO-AI, fine-tuned with KTO preference refinement. It is specifically optimized for agentic tasks within the ShoppingBench environment, achieving a 42.7% ASR on a production-strict held-out partition. This model is designed for distilling shopping agent behaviors from trajectory primitives, making it suitable for automated shopping and agent simulation tasks.
Loading preview...
Overview
oro-ai/qwen3-4b-shoppingbench-kto is a 4 billion parameter language model built upon the Qwen3 architecture, developed by ORO-AI. This model has undergone KTO (Kahneman-Tversky Optimization) preference refinement, specifically tailored for agentic tasks within the ShoppingBench environment. It serves as a companion artifact for the paper "Bittensor Agent Arenas as a Trajectory Primitive: Distilling a Shopping Agent from ShoppingBench Subnet Traces".
Key Capabilities
- Specialized for ShoppingBench: Achieves a 42.7% ASR (Agent Success Rate) on a leak-cluster-guarded, production-strict held-out partition of ShoppingBench, significantly improving upon the base Qwen3-4B's 18.0% ASR.
- KTO Refinement: Utilizes KTO preference refinement (v3) on top of a merged SFT champion model, enhancing its performance in specific agentic scenarios.
- Trajectory Primitive Distillation: Designed to distill shopping agent behaviors from ShoppingBench subnet traces, making it adept at understanding and executing complex shopping-related actions.
- Ready-to-Use: Provided as a merged full model, allowing direct loading with
transformersor serving with vLLM without requiring adapter stacking.
Training Data
The model was trained using a filtered corpus from oro-ai/sn15-shoppingbench-sft-15k and raw traces from oro-ai/sn15-shoppingbench-traces-18k.
Good For
- Developing and evaluating automated shopping agents.
- Research into agentic AI and trajectory primitive distillation.
- Applications requiring specialized language understanding and generation for e-commerce and online shopping interactions.