Jincenzi/SocialR1-8B

TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kTool Calling:SupportedPublished:May 11, 2026License:mitArchitecture:Transformer0.0K Open Weights Cold

Jincenzi/SocialR1-8B is an 8 billion parameter social reasoning model developed by Jincenzi, built upon the Qwen3-8B architecture. It is specifically trained with trajectory-level reinforcement learning (GRPO) using the Social-R1 framework to significantly enhance Theory-of-Mind (ToM) and social inference capabilities. This model excels at aligning reasoning processes with Social Information Processing (SIP) theory, making it highly effective for tasks requiring nuanced social understanding.

Loading preview...

SocialR1-8B: Enhanced Social Reasoning Model

Jincenzi/SocialR1-8B is an 8 billion parameter model based on Qwen3-8B, specifically designed to improve social reasoning and Theory-of-Mind (ToM) capabilities in large language models. Developed by Jincenzi, this model utilizes the innovative Social-R1 framework, which incorporates trajectory-level reinforcement learning (GRPO) to align its reasoning processes with the Social Information Processing (SIP) theory.

Key Capabilities and Features

  • SIP-Guided Reasoning: Employs a structured social inference process, moving from Cue Encoding to Cue Interpretation, Goal Clarification, and finally Response Generation.
  • Multi-Dimensional Reward System: Integrates structural, content, inference efficiency, and format rewards with a curriculum-style weighting during training.
  • Strong Performance: Demonstrates competitive or superior performance against larger models across various social reasoning benchmarks, including static MCQ tests (ToMBench, SocialIQA), open-ended generation (FanToM), and interactive social intelligence tasks (SOTOPIA).

Training and Evaluation

SocialR1-8B was trained using Group Relative Policy Optimization (GRPO) over 600 steps on 8 NVIDIA A100 GPUs. Its evaluation spans a comprehensive suite of tests to ensure robust social intelligence across different scenarios. The model's development is detailed in the paper: Social-R1: Enhancing Social Reasoning in LLMs through Trajectory-Level Reinforcement Learning.

Ideal Use Cases

This model is particularly well-suited for applications requiring advanced social understanding, such as empathetic AI, social simulation, character interaction in games, and any scenario where an LLM needs to infer and respond based on complex social cues and intentions.