sunweiwei/Ditto-8B
Ditto-8B is an 8 billion parameter open-weight model developed by sunweiwei, specifically designed for advanced human behavior simulation. It excels in tasks requiring theory of mind, character role-play, and social skills, covering areas like learner and user simulation. The model is trained using DITTO, a reinforcement learning method that leverages verbal feedback to distill guidance into its policy, enabling robust performance without explicit feedback during inference. It demonstrates strong performance across various conversational, social, and cognitive benchmarks related to human simulation.
Loading preview...
Ditto-8B: Advanced Human Behavior Simulation
Ditto-8B is an 8 billion parameter model developed by sunweiwei, specifically engineered for comprehensive human behavior simulation. It addresses a wide range of applications including theory of mind, character role-play, social skills, and user/persona simulation.
Key Capabilities & Training
- Human Behavior Simulation: Optimized for tasks that require understanding and mimicking human interaction and cognition.
- DITTO Training Method: Utilizes a novel reinforcement learning approach called DITTO, which incorporates verbal feedback during training. This method allows the model to learn and internalize descriptive feedback, eliminating the need for explicit feedback during inference.
- Strong Benchmark Performance: Outperforms several larger models, including Qwen3 8B Inst and some proprietary models, on specific human simulation benchmarks like UserLLM, MirrorBench, and Sotopia-Hard.
Good for
- Developing AI agents that require realistic human-like interaction and social understanding.
- Creating sophisticated chatbots for role-playing, customer service, or educational simulations.
- Research in AI psychology and understanding complex social dynamics through simulation.
- Applications needing robust character generation and consistent persona maintenance.