Jihyung803/Qwen3-8B-SOCIALIQA-DPO
Text generation · Model size: 8B · Quantization: FP8 · Context length: 32K · Architecture: Transformer · Published: Mar 19, 2026
Jihyung803/Qwen3-8B-SOCIALIQA-DPO is an 8-billion-parameter language model fine-tuned from Qwen/Qwen3-8B using Direct Preference Optimization (DPO). The fine-tune is trained to generate responses aligned with human preferences, particularly in social-intelligence and conversational contexts. Its 32K-token context length makes it suitable for nuanced, extended interactions, and the DPO training objective aims to improve the model's helpfulness and harmlessness in open-ended dialogue.
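To make the training objective concrete, here is a minimal sketch of the per-pair DPO loss the fine-tune optimizes. This is an illustration of the standard DPO formulation, not code from this model's training run; the function name and the beta value are illustrative assumptions.

```python
import math

def dpo_loss(pi_logp_w: float, pi_logp_l: float,
             ref_logp_w: float, ref_logp_l: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair (chosen response w, rejected response l).

    pi_logp_*  : log-prob of each response under the policy being trained
    ref_logp_* : log-prob of each response under the frozen reference model
                 (here that would be the base Qwen/Qwen3-8B)
    beta       : temperature controlling deviation from the reference (illustrative)
    """
    # Implicit reward margin: how much more the policy prefers the chosen
    # response over the rejected one, relative to the reference model.
    margin = (pi_logp_w - ref_logp_w) - (pi_logp_l - ref_logp_l)
    # Loss is -log sigmoid(beta * margin); it shrinks as the margin grows.
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))
```

Minimizing this loss pushes the policy to assign relatively higher probability to preferred (e.g. more socially appropriate) responses than the reference model does, without an explicit reward model.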