thu-coai/CharacterJudge

  • Task: Text Generation
  • Concurrency Cost: 1
  • Model Size: 7.6B
  • Quant: FP8
  • Ctx Length: 32k
  • Published: Mar 22, 2025
  • Architecture: Transformer

CharacterJudge is a 7.6-billion-parameter judge model developed by thu-coai for evaluating character customization in large language models. It has a context length of 32,768 tokens and serves as the core evaluation component of the CharacterBench framework, assessing how well LLMs adhere to and maintain specific character traits and personas.

CharacterJudge: A Specialized LLM for Character Evaluation

Within the CharacterBench framework, CharacterJudge acts as the judge that evaluates the character customization capabilities of other large language models. It provides a standardized method for assessing how effectively those models adopt and consistently maintain defined character traits and personas.

Key Capabilities

  • Automated Character Evaluation: Scores and assesses the performance of LLMs in character-driven scenarios without manual annotation of each response.
  • High Context Understanding: Supports a context length of 32,768 tokens, enough to process and evaluate complex character interactions and long-form narratives.
  • Research-Backed: The model and its application are detailed in a paper accepted to AAAI 2025, titled "CharacterBench: Benchmarking Character Customization of Large Language Models" (arXiv:2412.11912).
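
Because a judge model reads the character profile and the dialogue together, a long conversation may need trimming to stay inside the 32,768-token window. The following is a minimal sketch of that bookkeeping, not the CharacterBench pipeline: `count_tokens` is a hypothetical whitespace-based stand-in for the model's real tokenizer, and `fit_to_context`, the `reserve` value, and the idea of keeping the newest turns first are all assumptions made for illustration.

```python
from typing import Callable, List

MAX_CONTEXT_TOKENS = 32768  # context length stated on this model card


def count_tokens(text: str) -> int:
    """Hypothetical stand-in: counts whitespace-separated words.

    A real setup would count tokens with the model's own tokenizer.
    """
    return len(text.split())


def fit_to_context(
    profile: str,
    turns: List[str],
    reserve: int = 512,
    limit: int = MAX_CONTEXT_TOKENS,
    counter: Callable[[str], int] = count_tokens,
) -> List[str]:
    """Return the most recent dialogue turns that fit beside the profile.

    `reserve` is an assumed allowance for instructions and the judge's
    generated output; it is not a value taken from CharacterBench.
    """
    budget = limit - reserve - counter(profile)
    kept: List[str] = []
    # Walk backwards so the newest turns are kept first.
    for turn in reversed(turns):
        cost = counter(turn)
        if cost > budget:
            break
        kept.append(turn)
        budget -= cost
    kept.reverse()  # restore chronological order
    return kept
```

Passing a tokenizer-backed function as `counter` would make the budget exact; the whitespace count here only approximates it.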

Good For

  • Benchmarking LLM Character Consistency: Ideal for researchers and developers needing to quantitatively measure how well LLMs can embody and sustain specific character roles.
  • Developing Character-Rich Applications: Useful for evaluating and refining LLMs intended for role-playing, interactive storytelling, or virtual assistant applications requiring strong persona adherence.
  • Academic Research: A valuable tool for studies focusing on the nuanced aspects of character generation and maintenance in generative AI.
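
For benchmarking use, individual judge scores usually need to be summarized before models can be compared. The sketch below assumes the scores have already been parsed into numbers and grouped by an evaluation dimension; the function name, the dimension labels, and the sample values are all hypothetical and are not taken from CharacterBench.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def average_by_dimension(records: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Average judge scores per dimension.

    Each record is a (dimension, score) pair. Dimension names are
    whatever labels the caller chooses; none are prescribed here.
    """
    grouped: Dict[str, List[float]] = defaultdict(list)
    for dimension, score in records:
        grouped[dimension].append(score)
    return {dimension: mean(scores) for dimension, scores in grouped.items()}


# Made-up example scores, for illustration only.
sample = [("consistency", 4.0), ("consistency", 2.0), ("knowledge", 5.0)]
summary = average_by_dimension(sample)
```

Keeping the per-dimension averages separate, rather than collapsing them into one number, makes it easier to see where a model holds a persona well and where it drifts.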