Name: oro-ai/qwen3-4b-shoppingbench-kto API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: oro-ai

Overview

oro-ai/qwen3-4b-shoppingbench-kto is a 4 billion parameter language model built upon the Qwen3 architecture, developed by ORO-AI. This model has undergone KTO (Kahneman-Tversky Optimization) preference refinement, specifically tailored for agentic tasks within the ShoppingBench environment. It serves as a companion artifact for the paper "Bittensor Agent Arenas as a Trajectory Primitive: Distilling a Shopping Agent from ShoppingBench Subnet Traces".

Key Capabilities

Specialized for ShoppingBench: Achieves a 42.7% ASR (Agent Success Rate) on a leak-cluster-guarded, production-strict held-out partition of ShoppingBench, significantly improving upon the base Qwen3-4B's 18.0% ASR.
KTO Refinement: Utilizes KTO preference refinement (v3) on top of a merged SFT champion model, enhancing its performance in specific agentic scenarios.
Trajectory Primitive Distillation: Designed to distill shopping agent behaviors from ShoppingBench subnet traces, making it adept at understanding and executing complex shopping-related actions.
Ready-to-Use: Provided as a merged full model, allowing direct loading with transformers or serving with vLLM without requiring adapter stacking.

Training Data

The model was trained using a filtered corpus from oro-ai/sn15-shoppingbench-sft-15k and raw traces from oro-ai/sn15-shoppingbench-traces-18k.

Good For

Developing and evaluating automated shopping agents.
Research into agentic AI and trajectory primitive distillation.
Applications requiring specialized language understanding and generation for e-commerce and online shopping interactions.

Overview

Overview

Key Capabilities

Training Data

Good For

Full Model Card (README)