BAAI/JudgeLM-13B-v1.0

Text Generation | Concurrency Cost: 1 | Model Size: 13B | Quant: FP8 | Ctx Length: 4k | Published: Oct 27, 2023 | Architecture: Transformer

BAAI/JudgeLM-13B-v1.0 is a 13 billion parameter auto-regressive language model developed by HUST and BAAI, fine-tuned from Vicuna-v1.3. This model is specifically designed as a judge model, trained on the JudgeLM-100K dataset to evaluate the performance of large language models and chatbots. Its primary application is in research for assessing LLM outputs, offering a specialized tool for NLP and AI researchers.

JudgeLM-13B-v1.0: A Specialized LLM Judge

JudgeLM-13B-v1.0 is a 13 billion parameter language model developed by HUST and BAAI to evaluate the output of other large language models (LLMs) and chatbots. It is fine-tuned from the Vicuna-v1.3 model on judge samples so that it can act as an automated judge.

Key Capabilities

  • LLM Evaluation: Designed to assess the quality and performance of responses generated by various large language models.
  • Specialized Training: Fine-tuned on approximately 200,000 judge samples from the JudgeLM-100K dataset, enhancing its ability to provide nuanced judgments.
  • Research Tool: Primarily intended for academic and research purposes in natural language processing and artificial intelligence.

Good For

  • Benchmarking LLMs: Researchers can use JudgeLM to systematically evaluate and compare different LLMs.
  • Chatbot Performance Assessment: Suited to assessing the effectiveness and coherence of chatbot interactions.
  • Academic Research: Supports studies on LLM evaluation methodologies and the development of automated judging systems. Further details are available in the associated research paper.
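
The benchmarking use case above can be sketched as a small tallying step that runs after the judge has replied. This is a minimal sketch, not the JudgeLM authors' pipeline: it assumes each judge reply starts with two space-separated numeric scores (one per compared answer) followed by an explanation, which this page does not confirm. The function names `parse_score_pair` and `tally_pairwise` and the sample replies are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def parse_score_pair(judge_output: str) -> Tuple[float, float]:
    """Read two space-separated scores from the first line of a judge reply.

    Assumed (hypothetical) reply format: "<score_a> <score_b>\n<explanation>".
    """
    first_line = judge_output.strip().splitlines()[0]
    a, b = first_line.split()[:2]
    return float(a), float(b)


def tally_pairwise(judge_outputs: Iterable[str]) -> Counter:
    """Count wins for model A, wins for model B, and ties."""
    tally = Counter({"a_wins": 0, "b_wins": 0, "ties": 0})
    for output in judge_outputs:
        score_a, score_b = parse_score_pair(output)
        if score_a > score_b:
            tally["a_wins"] += 1
        elif score_b > score_a:
            tally["b_wins"] += 1
        else:
            tally["ties"] += 1
    return tally


# Hand-written example replies for illustration, not real model output.
sample_outputs = [
    "8 6\nAnswer A is more complete.",
    "5 9\nAnswer B cites the correct formula.",
    "7 7\nBoth answers are equivalent.",
]
result = tally_pairwise(sample_outputs)
print(dict(result))
```

Counting wins and ties, rather than averaging raw scores, keeps the comparison robust to a judge that uses only part of its score range. If the actual reply format differs, only `parse_score_pair` needs to change.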