Name: flavianv/deepoutfit-qwen17b-sft-dpo API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: flavianv

DeepOutfit Qwen1.7B SFT+DPO Overview

This model, flavianv/deepoutfit-qwen17b-sft-dpo, is an experimental 2 billion parameter full-weight fine-tune of Qwen/Qwen3-1.7B. Its primary purpose is to generate structured JSON-action outfit recommendation traces, specifically for fashion/outfit agents that interact with a product catalog.

Key Capabilities & Training

Specialized Fine-tuning: The model underwent Supervised Fine-Tuning (SFT) and Direct Preference Optimization (DPO) stages using filtered JSON-action outfit rollouts.
Outfit Report Generation: It is designed to search a product catalog, select five products, and produce a structured final outfit report.
Context Length: Trained with a maximum length of 16,384 tokens during SFT and 8,192 tokens during DPO.
Performance Improvement: Evaluation against the zero-shot Qwen3-1.7B showed an improvement in GPT-4.1 judged outfit quality, with a mean judge score of 41.58 compared to 29.93 for the base model.

Intended Use Cases & Limitations

Research on Fashion Agents: Ideal for research and development of catalog-grounded fashion agents that require structured output for outfit recommendations.
External Tooling Required: It is not a general-purpose shopping assistant; it requires an external tool loop for product search results and a validator/scorer for the final JSON report.
Experimental Status: This is an experimental research checkpoint, not validated for production. It may produce incomplete, impractical, or unsupported product combinations.
Specific Optimization: Optimized for outfit/product-report behavior, not broad assistant quality. The dominant remaining failure mode identified is outfit practicality.

Overview

DeepOutfit Qwen1.7B SFT+DPO Overview

Key Capabilities & Training

Intended Use Cases & Limitations

Full Model Card (README)