Jincenzi/SocialR1-8B
Jincenzi/SocialR1-8B is a 4 billion parameter social reasoning model built on Qwen3-8B, trained with trajectory-level reinforcement learning (GRPO) using the Social-R1 framework. It enhances Theory-of-Mind (ToM) and social inference capabilities by aligning reasoning processes with the Social Information Processing (SIP) theory. This model excels at social reasoning tasks, matching or outperforming larger baselines across static MCQ benchmarks, open-ended generation, and interactive settings. It features a 32768 token context length, making it suitable for complex social interaction simulations and analyses.
Loading preview...
Jincenzi/SocialR1-8B: Enhanced Social Reasoning LLM
Jincenzi/SocialR1-8B is a 4 billion parameter language model developed by Jincenzi, specifically designed to excel in social reasoning tasks. Built upon the Qwen3-8B architecture, this model leverages the Social-R1 framework and trajectory-level reinforcement learning (GRPO) to significantly improve its Theory-of-Mind (ToM) and social inference capabilities. Its training aligns reasoning processes with the Social Information Processing (SIP) theory, enabling more nuanced and context-aware social understanding.
Key Capabilities & Innovations
- SIP-Guided Reasoning: The model's inference process is structured according to the Social Information Processing (SIP) theory, progressing through stages like Cue Encoding, Cue Interpretation, Goal Clarification, and Response Generation. This structured approach enhances the consistency and accuracy of social inferences.
- Multi-Dimensional Reward System: Training incorporates a sophisticated reward mechanism that combines structural reward, content reward, inference efficiency, and format reward. This multi-faceted approach, weighted with a curriculum-style methodology, optimizes the model for comprehensive social intelligence.
- Strong Performance: SocialR1-8B demonstrates competitive performance, matching or surpassing larger baseline models on various social intelligence benchmarks. This includes static multiple-choice question (MCQ) tests (e.g., ToMBench, SocialIQA), open-ended generation tasks (FanToM), and interactive social intelligence environments (SOTOPIA).
Ideal Use Cases
This model is particularly well-suited for applications requiring advanced social understanding and interaction. Developers should consider SocialR1-8B for:
- Social Simulation: Creating agents that can understand and respond to complex social cues and scenarios.
- Psychological Research: Modeling human social cognition and Theory-of-Mind.
- Interactive Storytelling & Roleplay: Generating more believable and socially intelligent character interactions.
- Ethical AI Development: Building systems that can better interpret social contexts and potential impacts of their actions.
For more details, refer to the accompanying research paper: Social-R1: Enhancing Social Reasoning in LLMs through Trajectory-Level Reinforcement Learning.