snap-stanford/humanlm-opinion

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:32kPublished:Feb 12, 2026License:apache-2.0Architecture:Transformer0.0K Open Weights Warm

HumanLM-Opinion is an 8 billion parameter user simulator developed by snap-stanford, built upon the Qwen3-8B base model and trained with GRPO on the Humanual-Opinion benchmark. This model specializes in generating opinionated responses that capture underlying user states across cognitive, normative, affective, and linguistic dimensions, rather than merely imitating surface-level language. Its primary use case is simulating diverse user feedback for research, content testing, and AI alignment, offering a 32768-token context length.

Loading preview...

HumanLM-Opinion: Simulating User States

HumanLM-Opinion, developed by snap-stanford, is an 8 billion parameter user simulator built on the Qwen3-8B base model. Unlike traditional fine-tuning that focuses on response imitation, HumanLM is trained using Group Relative Policy Optimization (GRPO) with a unique state alignment method. It explicitly models and generates responses based on six psychologically-grounded user state dimensions: belief, goal, value, stance, emotion, and communication style.

This specific checkpoint is fine-tuned on the Humanual-Opinion benchmark, which comprises Reddit users' opinionated responses in personal-issue discussion threads. The model's generation process includes a <think> block where it reasons about these latent states before producing the final response.

Key Capabilities

  • State-Aligned Response Generation: Generates responses that reflect a user's underlying beliefs, emotions, values, and communication style.
  • Opinionated User Simulation: Excels at producing diverse, opinionated feedback, particularly in discussion-based contexts.
  • Contextual Reasoning: Utilizes a <think> block to reason about latent user states, enhancing the realism and depth of simulated responses.
  • High Naturalness: Achieved a 76.6% rating of "quite natural" or "indistinguishable from human" in real-time user studies.

Good for

  • User Research: Understanding how different user personas might react to content or scenarios.
  • Content Testing: Predicting audience reactions to posts, articles, or policy drafts.
  • AI Alignment: Generating varied and realistic user feedback to train and evaluate collaborative AI systems.
  • Social Simulation: Modeling opinion dynamics and interactions within online communities.