Jincenzi/SocialR1-4B

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:4BQuant:BF16Ctx Length:32kPublished:May 11, 2026License:mitArchitecture:Transformer Open Weights Warm

Jincenzi/SocialR1-4B is a 4 billion parameter social reasoning model built on Qwen3-4B, developed by Jincenzi. It is trained with trajectory-level reinforcement learning (GRPO) using the Social-R1 framework to enhance social reasoning capabilities. This model aligns reasoning processes with the Social Information Processing (SIP) theory, enabling it to match or outperform larger baselines on social intelligence benchmarks. It is primarily designed for tasks requiring nuanced social inference and understanding.

Loading preview...

SocialR1-4B: Enhanced Social Reasoning Model

Jincenzi/SocialR1-4B is a 4 billion parameter model based on Qwen3-4B, specifically engineered to excel in social reasoning tasks. It leverages the Social-R1 framework and is trained using trajectory-level reinforcement learning (GRPO).

Key Capabilities & Innovations

  • SIP-Guided Reasoning: The model's reasoning process is aligned with the Social Information Processing (SIP) theory, enforcing a consistent inference flow from cue encoding to response generation.
  • Multi-Dimensional Reward System: Training incorporates a sophisticated reward mechanism that combines structural and content rewards based on SIP, inference efficiency, and format, weighted in a curriculum-style approach.
  • Strong Performance: Despite its compact 4B parameter size, SocialR1-4B demonstrates performance comparable to or exceeding substantially larger models across various social intelligence evaluations. This includes static MCQ benchmarks (e.g., ToMBench, SocialIQA), open-ended generation (FanToM), and interactive settings (SOTOPIA).

Training & Evaluation

The model was trained for 600 steps on 8 NVIDIA A100 GPUs, utilizing Group Relative Policy Optimization (GRPO). Its evaluation spans a comprehensive suite of tests, confirming its enhanced ability to understand and navigate complex social scenarios.

When to Use This Model

SocialR1-4B is particularly well-suited for applications requiring advanced social intelligence, such as understanding human intent, predicting social dynamics, or generating socially appropriate responses. Its efficiency and strong performance make it a compelling choice for integrating nuanced social reasoning into AI systems.