hkust-nlp/deita-complexity-scorer
The hkust-nlp/deita-complexity-scorer is a model developed by HKUST NLP, fine-tuned from the 13-billion-parameter Llama-1-13b-hf, and designed to automatically annotate the instruction complexity of Supervised Fine-Tuning (SFT) data. It assigns a numerical complexity score to user queries, making it a useful tool for automatic data selection in Large Language Model (LLM) instruction tuning. Its primary use case is streamlining dataset curation by identifying and scoring the complexity of instructions.
Deita Complexity Scorer: Automatic Instruction Complexity Annotation
The Deita Complexity Scorer, developed by HKUST NLP, is a specialized model fine-tuned from the 13-billion-parameter Llama-1-13b-hf. It automatically annotates the instruction complexity of Supervised Fine-Tuning (SFT) data, playing a key role in the Deita project's goal of automatic data selection for Large Language Models (LLMs).
Key Capabilities
- Automated Complexity Scoring: Assigns a numerical complexity score (1-6) to user instructions.
- Data Selection Enhancement: Helps in curating high-quality SFT datasets by identifying and filtering instructions based on their complexity.
- Fine-tuned Performance: Leverages a Llama-1-13b-hf base model, specifically adapted for this annotation task.
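A common way such scorers produce a 1-6 rating is to read the probabilities the model assigns to the digit tokens "1" through "6" and take their weighted average. The sketch below illustrates only that aggregation step; the probability values are hypothetical stand-ins for real model logits, not output from the actual scorer:

```python
# Sketch of turning a next-token distribution over the digit tokens "1".."6"
# into a single complexity score via a probability-weighted average.
# The probabilities below are hypothetical, not real scorer output.

def expected_complexity(digit_probs):
    """Weighted average of the scores 1..6 under the given probabilities."""
    total = sum(digit_probs.values())
    return sum(int(d) * p for d, p in digit_probs.items()) / total

# Hypothetical distribution a scorer might assign to a mid-difficulty query.
probs = {"1": 0.02, "2": 0.05, "3": 0.13, "4": 0.40, "5": 0.30, "6": 0.10}
score = expected_complexity(probs)
print(round(score, 2))  # → 4.21
```

Because the result is an expectation rather than a single sampled digit, nearby instructions get smoothly differing scores, which is convenient for ranking and thresholding.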
Good for
- Researchers and developers working on instruction tuning for LLMs.
- Automating the selection and filtering of training data based on instruction complexity.
- Improving the efficiency and quality of SFT dataset creation.
- Analyzing the complexity distribution of existing instruction datasets.
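As a concrete illustration of the data-selection use case, one simple strategy is to keep only instructions whose complexity score clears a threshold. A minimal sketch with hypothetical records and scores (the Deita project's full selection pipeline also accounts for quality and diversity, which this does not model):

```python
# Hypothetical SFT records paired with complexity scores already produced
# by a scorer; threshold filtering is one simple selection strategy.
records = [
    {"instruction": "Say hello.", "complexity": 1.4},
    {"instruction": "Summarize the report in three bullet points.", "complexity": 4.2},
    {"instruction": "Prove that the square root of 2 is irrational.", "complexity": 5.1},
]

THRESHOLD = 3.0  # assumed cutoff; tune for your dataset
selected = [r for r in records if r["complexity"] >= THRESHOLD]
print(len(selected))  # → 2
```

Sorting by score instead of hard-thresholding is a natural variant when a fixed training budget dictates how many examples to keep.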