Name: oro-ai/qwen3-4b-shoppingbench-rejection API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: oro-ai

Overview

This model, oro-ai/qwen3-4b-shoppingbench-rejection, is a 4 billion parameter variant of the Qwen3 architecture, developed by ORO-AI. It represents the second stage of fine-tuning within the ShoppingBench distillation pipeline, utilizing reward-weighted rejection sampling. The primary goal of this fine-tuning is to enhance the model's performance in agentic shopping scenarios.

Key Capabilities

Enhanced Agent Success Rate (ASR): The model achieves a 42.7% ASR on ShoppingBench, a substantial improvement over the base Qwen3-4B's 18.0% ASR. This metric is evaluated on a leak-cluster-guarded, held-out partition with production-strict scoring.
Rejection-Sampled Fine-tuning: It leverages reward-weighted rejection sampling, a technique designed to distill high-performing agent trajectories.
Ready-to-Use: This is a merged full model, meaning the Qwen3-4B base weights are integrated with the trained delta, allowing direct loading with transformers or serving with vLLM without requiring adapter stacking.

Training Data

Fine-tuned on a filtered corpus: oro-ai/sn15-shoppingbench-sft-15k
Utilizes raw traces from: oro-ai/sn15-shoppingbench-traces-18k

Good For

Developing and deploying automated shopping agents.
Research into trajectory primitive distillation and reward-weighted fine-tuning for agentic tasks.
Applications requiring a specialized language model for e-commerce interactions and decision-making.

Overview

Overview

Key Capabilities

Training Data

Good For

Full Model Card (README)