alphaXiv/sdpo-tau-retail-sft-qwen3-4b

TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:Jun 24, 2026Architecture:Transformer Cold

The alphaXiv/sdpo-tau-retail-sft-qwen3-4b is a 4 billion parameter language model based on the Qwen3 architecture, developed by alphaXiv. This model is a fine-tuned version, likely optimized for specific retail-related tasks through supervised fine-tuning (SFT) and potentially Direct Preference Optimization (DPO). With a context length of 32768 tokens, it is designed for applications requiring processing of extensive text inputs within a retail context.

Loading preview...

Model Overview

The alphaXiv/sdpo-tau-retail-sft-qwen3-4b is a 4 billion parameter language model built upon the Qwen3 architecture. This model has undergone supervised fine-tuning (SFT) and potentially Direct Preference Optimization (DPO), suggesting an optimization for specific use cases, likely within the retail domain.

Key Characteristics

  • Architecture: Qwen3-based, indicating a robust foundation for language understanding and generation.
  • Parameter Count: 4 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: Supports a substantial context window of 32768 tokens, enabling the processing of long documents or complex conversational histories.
  • Fine-tuning: The sdpo-tau-retail-sft in its name implies specialized training, likely for retail-specific tasks, leveraging supervised fine-tuning and potentially advanced preference alignment techniques.

Potential Use Cases

Given its likely retail-specific fine-tuning, this model could be particularly effective for:

  • Customer Service Automation: Handling retail-specific queries, product information, and support.
  • Product Description Generation: Creating detailed and engaging descriptions for e-commerce platforms.
  • Retail Analytics: Processing and summarizing customer feedback, reviews, or market trends.
  • Personalized Recommendations: Assisting in generating tailored product suggestions based on user data.

Further details regarding its specific training data, evaluation metrics, and intended applications are currently marked as "More Information Needed" in the model card.