Model Overview
This model, eugeneyan/semantic-id-qwen3-8b-video-games, is an 8-billion-parameter Qwen3-based language model fine-tuned for video game product recommendation. Its core innovation is the use of Semantic IDs: learned hierarchical representations that encode product similarity directly in their structure. Unlike traditional product IDs, Semantic IDs carry meaning, so similar products share ID prefixes.
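To make the prefix idea concrete, here is a minimal sketch that measures how many leading levels two Semantic IDs have in common. It assumes each ID can be represented as a tuple of four integer codes, one per level; the specific code values are invented for illustration and do not come from the model's actual codebook.

```python
from typing import Sequence


def shared_prefix_len(a: Sequence[int], b: Sequence[int]) -> int:
    """Count how many leading levels two Semantic IDs have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


# Hypothetical 4-level IDs, made up for illustration only.
game_a = (12, 7, 201, 3)
game_b = (12, 7, 198, 0)   # agrees with game_a on the first two levels
game_c = (45, 7, 201, 3)   # differs at the first level, so no shared prefix

print(shared_prefix_len(game_a, game_b))  # 2
print(shared_prefix_len(game_a, game_c))  # 0
```

Note that only the leading levels count: game_c matches game_a on three later positions, yet the shared prefix is empty because the hierarchy is read from the first level down.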
Key Capabilities
- Generative Recommendation: Generates Semantic IDs for recommended video games based on user input or natural language queries.
- Semantic ID Understanding: Processes and generates special tokens (<|sid_start|>, <|sid_X|>, <|sid_end|>, <|rec|>) to work with 4-level hierarchical Semantic IDs.
- Contextual Recommendations: Supports recommendations based on past interactions, natural language descriptions (e.g., "I like scifi and action games"), and attribute-steered queries (e.g., "Recommend Xbox games similar to...").
- Explanatory Recommendations: Can generate natural language explanations alongside Semantic ID recommendations.
- Multi-Turn Conversations: Maintains context across multiple turns for refined recommendations.
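Because recommendations come back as special tokens mixed with ordinary text, downstream code usually needs to pull the Semantic IDs out of the generated string. The following is a minimal sketch of one way to do that. It assumes that the X in <|sid_X|> is a non-negative integer and that a complete ID is exactly four level tokens wrapped in <|sid_start|> and <|sid_end|>; the function name and the sample string are hypothetical.

```python
import re
from typing import List, Tuple

# One full ID: a start token, one or more level tokens, then an end token.
SID_SPAN = re.compile(r"<\|sid_start\|>((?:<\|sid_\d+\|>)+)<\|sid_end\|>")
# A single level token; the captured group is the integer code (assumed format).
SID_LEVEL = re.compile(r"<\|sid_(\d+)\|>")


def extract_semantic_ids(text: str, levels: int = 4) -> List[Tuple[int, ...]]:
    """Return every well-formed Semantic ID in `text` as a tuple of level codes.

    Spans that do not contain exactly `levels` level tokens are skipped.
    """
    ids = []
    for span in SID_SPAN.finditer(text):
        codes = tuple(int(c) for c in SID_LEVEL.findall(span.group(1)))
        if len(codes) == levels:
            ids.append(codes)
    return ids


# Hypothetical model output; the codes are invented for illustration.
sample = (
    "You might enjoy "
    "<|sid_start|><|sid_12|><|sid_300|><|sid_601|><|sid_768|><|sid_end|> "
    "because it matches your taste for sci-fi action."
)
print(extract_semantic_ids(sample))  # [(12, 300, 601, 768)]
```

Dropping spans with the wrong number of levels is a deliberate choice here: a truncated or over-long ID cannot be resolved to a product, so it is safer to discard it than to pass it along.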
Training Details
The model was fine-tuned with Supervised Fine-Tuning (SFT) for 2 epochs on the Amazon Video Games reviews and metadata dataset, which covers 66,097 unique products. It is trained only for video games, and its Semantic IDs are fixed at training time. A separate mapping dataset is required to translate the generated IDs into product titles.
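The lookup step can be sketched as a plain dictionary from Semantic ID string to product title. Everything below is illustrative: the mapping entry, the title, and the helper name are hypothetical, and the real mapping dataset may use a different schema. The sketch also assumes the model can occasionally emit an ID that is absent from the mapping, so the lookup returns None rather than raising.

```python
from typing import Dict, Optional

# Hypothetical mapping; in practice this would be loaded from the separate
# mapping dataset that accompanies the model.
sid_to_title: Dict[str, str] = {
    "<|sid_start|><|sid_12|><|sid_300|><|sid_601|><|sid_768|><|sid_end|>": "Example Game A",
}


def title_for(sid: str, mapping: Dict[str, str]) -> Optional[str]:
    """Resolve a generated Semantic ID string to a title, or None if unknown."""
    return mapping.get(sid.strip())


known = "<|sid_start|><|sid_12|><|sid_300|><|sid_601|><|sid_768|><|sid_end|>"
unknown = "<|sid_start|><|sid_1|><|sid_2|><|sid_3|><|sid_4|><|sid_end|>"

print(title_for(known, sid_to_title))    # Example Game A
print(title_for(unknown, sid_to_title))  # None
```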
Good For
- Building video game recommendation systems that leverage semantic similarity.
- Generative retrieval of video game products based on learned hierarchical identifiers.
- Applications requiring context-aware and attribute-steered product suggestions within the video game domain.