Name: BAAI/JudgeLM-13B-v1.0 API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: BAAI

JudgeLM-13B-v1.0: A Specialized LLM Judge

JudgeLM-13B-v1.0 is a 13-billion-parameter language model developed by HUST and BAAI for evaluating the output of other large language models (LLMs) and chatbots. It is fine-tuned from the Vicuna-v1.3 model on judge samples so that it can act as an automated judge.

Key Capabilities

LLM Evaluation: Designed to assess the quality and performance of responses generated by various large language models.
Specialized Training: Fine-tuned on approximately 200,000 judge samples from the JudgeLM-100K dataset so that it can score and compare model responses.
Research Tool: Primarily intended for academic and research purposes in natural language processing and artificial intelligence.
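
To make the evaluation workflow concrete, here is a minimal sketch of how a pairwise judging request could be assembled and how a reply could be parsed. The template wording, the function names build_judge_prompt and parse_scores, and the assumption that the judge puts two numeric scores on the first line of its reply are all illustrative; they are not taken from the official model card, so check the README for the exact prompt format before relying on this.

```python
import re
from typing import Optional, Tuple

# Hypothetical template: the wording is illustrative, not the official JudgeLM prompt.
JUDGE_TEMPLATE = (
    "[Question]\n{question}\n\n"
    "[Answer 1]\n{answer_1}\n\n"
    "[Answer 2]\n{answer_2}\n\n"
    "Rate each answer from 1 to 10. Output the two scores on the first line, "
    "separated by a space, then explain your reasoning."
)


def build_judge_prompt(question: str, answer_1: str, answer_2: str) -> str:
    """Fill the (assumed) template with one question and two candidate answers."""
    return JUDGE_TEMPLATE.format(
        question=question, answer_1=answer_1, answer_2=answer_2
    )


def parse_scores(judge_output: str) -> Optional[Tuple[float, float]]:
    """Read two scores from the first line of the reply.

    Returns None when the first line does not contain at least two numbers,
    so callers can skip or retry malformed judgments.
    """
    lines = judge_output.strip().splitlines()
    if not lines:
        return None
    numbers = re.findall(r"\d+(?:\.\d+)?", lines[0])
    if len(numbers) < 2:
        return None
    return float(numbers[0]), float(numbers[1])
```

The prompt string would be sent to the model through whatever client you use; parse_scores only handles the returned text and makes no network calls.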

Good For

Benchmarking LLMs: Researchers can use JudgeLM to systematically evaluate and compare different LLMs.
Chatbot Performance Assessment: Can be used to assess the quality and coherence of chatbot responses.
Academic Research: Supports studies on LLM evaluation methodologies and the development of automated judging systems. Further details are available in the associated research paper.
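
For benchmarking, per-question judgments are usually aggregated into win rates. The sketch below assumes each question was judged twice, once in the original answer order and once with the two answers swapped, which is a common way to reduce position bias in LLM judges. The names verdict and win_rates and the tie-handling rule are hypothetical choices for illustration, not part of JudgeLM itself.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Scores = Tuple[float, float]  # (score for first-listed answer, score for second-listed answer)


def verdict(original: Scores, swapped: Scores) -> str:
    """Combine two judgments of the same pair into one label.

    original: scores with model A listed first and model B second.
    swapped:  scores with model B listed first and model A second.
    """
    model_a = original[0] + swapped[1]
    model_b = original[1] + swapped[0]
    if model_a > model_b:
        return "A"
    if model_b > model_a:
        return "B"
    return "tie"


def win_rates(verdicts: Iterable[str]) -> Dict[str, float]:
    """Return the fraction of questions labelled A, B, or tie."""
    counts = Counter(verdicts)
    total = sum(counts.values())
    if total == 0:
        return {"A": 0.0, "B": 0.0, "tie": 0.0}
    return {label: counts[label] / total for label in ("A", "B", "tie")}
```

Summing the two orderings means a model only wins a question when its advantage survives the swap; stricter rules (for example, requiring a win in both orderings) are equally reasonable.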
