Overview
jacopo-minniti/Qwen2.5-14B-llm-as-judge is a 14.8-billion-parameter language model fine-tuned from Qwen/Qwen2.5-14B-Instruct. It was trained with the TRL (Transformer Reinforcement Learning) library using supervised fine-tuning (SFT).
Key Capabilities
- LLM-as-a-Judge Functionality: The model is fine-tuned to act as an automated judge for evaluating the outputs of other large language models, making it suitable for quality assessment, response comparison, and automated feedback generation.
- Base Model Strength: Built upon the Qwen2.5-14B-Instruct architecture, it inherits strong general language understanding and generation capabilities.
- Context Length: A 32,768-token context window lets it process and evaluate long texts or complex multi-turn interactions when acting as a judge.
Training Details
The model was trained with supervised fine-tuning (SFT), using the following framework versions:
- TRL: 0.19.0
- Transformers: 4.53.1
- PyTorch: 2.1.0+cu118
- Datasets: 3.6.0
- Tokenizers: 0.21.2
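To reproduce this environment, the version pins above can be captured in a requirements file, as sketched below. Note that the `+cu118` PyTorch build is not on PyPI and typically requires installing from the PyTorch CUDA 11.8 wheel index (e.g. via `pip install --extra-index-url https://download.pytorch.org/whl/cu118 ...`).

```
trl==0.19.0
transformers==4.53.1
torch==2.1.0+cu118
datasets==3.6.0
tokenizers==0.21.2
```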
Use Cases
- Automated evaluation of LLM responses.
- Benchmarking and comparing different language models.
- Providing automated feedback on generated text quality.