Joinn/UserMirrorrer-Qwen-DPO
Joinn/UserMirrorrer-Qwen-DPO is a 3.1-billion-parameter model based on Qwen-2.5-3B-Instruct, developed by Joinn and fine-tuned specifically as a user simulator for recommender systems. It leverages extensive user feedback and Direct Preference Optimization (DPO) to achieve preference alignment, and it expresses its decision-making process as an explanatory rationale, which helps reduce ambiguity in recommendation simulation samples.
Overview
Joinn/UserMirrorrer-Qwen-DPO is a 3.1-billion-parameter model built on the Qwen-2.5-3B-Instruct base and developed by Joinn. Its primary purpose is to serve as a preference-aligned user simulator for recommender systems (RSs). The model was introduced in the paper "Mirroring Users: Towards Building Preference-aligned User Simulator with User Feedback in Recommendation" and is designed to simulate user behavior by incorporating user feedback.
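Querying a user simulator like this typically means handing the model a user's interaction history plus a slate of candidate items in a chat-style prompt. The sketch below shows one plausible way to frame that request; the system prompt wording and item formatting are assumptions for illustration, since the exact prompt schema used by UserMirrorer is not documented here.

```python
# Hypothetical prompt construction for a user-simulation request.
# The system prompt and field layout below are illustrative assumptions,
# not the documented UserMirrorer prompt format.

def build_simulation_messages(history, candidates):
    """Assemble chat messages asking the model to act as the user."""
    system = (
        "You are simulating a user of a recommender system. "
        "Given the user's interaction history and a list of candidate items, "
        "choose the item the user would most likely engage with and explain "
        "the decision-making process behind the choice."
    )
    user = (
        "Interaction history:\n"
        + "\n".join(f"- {item}" for item in history)
        + "\n\nCandidate items:\n"
        + "\n".join(f"{i + 1}. {item}" for i, item in enumerate(candidates))
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

messages = build_simulation_messages(
    history=["The Matrix", "Blade Runner"],
    candidates=["Inception", "Notting Hill"],
)
```

A message list in this shape can be passed to any chat-completion interface (for example, via a chat template) to obtain the simulated user's choice.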
Key Capabilities and Features
- User Simulation: Specifically engineered to mimic user behavior in recommender systems.
- Preference Alignment: Achieves better alignment with user preferences through extensive feedback integration.
- Explanatory Rationales: Emits its decision-making process as an explanatory rationale, making simulation samples easier to interpret.
- Fine-tuning: Underwent a two-stage fine-tuning process:
  - Supervised Fine-tuning (SFT) for 1 epoch.
  - Direct Preference Optimization (DPO) for 2 epochs.
- Dataset: Trained on the UserMirrorer dataset.
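The DPO stage above consumes preference pairs: for each prompt, a preferred ("chosen") response and a dispreferred ("rejected") one. Below is a minimal sketch of that record shape, assuming the common `prompt`/`chosen`/`rejected` layout; the actual schema of the UserMirrorer dataset may differ.

```python
# Minimal sketch of a DPO preference record. The prompt/chosen/rejected
# field names follow the common DPO convention and are an assumption here,
# not the documented UserMirrorer dataset schema.

def make_dpo_record(prompt, chosen, rejected):
    """Bundle one preference pair for DPO training."""
    assert chosen != rejected, "chosen and rejected responses must differ"
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

record = make_dpo_record(
    prompt=(
        "Given the history [The Matrix, Blade Runner], which candidate "
        "would the user pick: 1. Inception or 2. Notting Hill?"
    ),
    chosen="I would pick 1 (Inception): it matches my taste for sci-fi.",
    rejected="I would pick 2 (Notting Hill).",
)
```

Pairing a well-reasoned chosen response against a weaker rejected one is what steers the model toward responses that reflect actual user preferences.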
Use Cases
This model is particularly well-suited for:
- Recommender System Research: Developing and testing new recommender algorithms by simulating realistic user interactions.
- User Behavior Modeling: Gaining insights into how users might interact with recommended items.
- Reducing Ambiguity: Generating simulation samples with clearer, more interpretable decision-making processes.
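To make the ambiguity-reduction point concrete: because the simulator produces a rationale alongside its choice, downstream code can parse both and discard samples with no clear decision. A small sketch, assuming a hypothetical `Choice: <n>` / `Rationale: <text>` output convention (the model's real output format may differ):

```python
import re

# Parse a simulated user's response into a structured choice plus rationale.
# The "Choice:"/"Rationale:" format is a hypothetical convention for
# illustration, not the model's documented output format.

def parse_simulation(output):
    """Extract the chosen item index and its rationale, or None if ambiguous."""
    choice_match = re.search(r"Choice:\s*(\d+)", output)
    rationale_match = re.search(r"Rationale:\s*(.+)", output, re.DOTALL)
    if not choice_match:
        return None  # ambiguous sample: no parseable decision
    return {
        "choice": int(choice_match.group(1)),
        "rationale": rationale_match.group(1).strip() if rationale_match else "",
    }

parsed = parse_simulation(
    "Choice: 1\nRationale: The user favors sci-fi, so Inception fits best."
)
```

Filtering out `None` results is one simple way such rationales can yield cleaner, more interpretable simulation samples.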
For more technical details, refer to the associated research paper.