Model Overview
RankVicuna is a 7-billion-parameter autoregressive language model developed by Castorini, fine-tuned from lmsys/vicuna-7b-v1.5. The base Vicuna model is built on the Llama 2 transformer architecture and was trained on user-shared conversations from ShareGPT; RankVicuna's fine-tuning additionally incorporates data augmentation techniques. The model is intended primarily for research at the intersection of large language models and retrieval systems.
Key Capabilities
- Specialized Fine-tuning: Instruction fine-tuned from Vicuna-7B-v1.5, which itself is based on Llama 2.
- Research Focus: Designed specifically for research in natural language processing and information retrieval, particularly for ranking applications.
- Data Augmentation: Fine-tuning incorporates data augmentation to improve ranking performance.
Good For
- Information Retrieval Research: Ideal for researchers and hobbyists exploring how LLMs can be applied to and improve retrieval tasks.
- Ranking Applications: Suited for experiments and development related to ranking documents or responses.
- Academic Study: A valuable tool for studying the effects of instruction fine-tuning and data augmentation on Llama 2-based models in specific domains.
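For ranking experiments, models in this family are typically used in a listwise fashion: the query and a numbered list of candidate passages are placed in a single prompt, and the model emits an ordering such as `[2] > [1] > [3]`. The sketch below shows one way to build such a prompt and parse the ordering. It is illustrative only: the exact prompt template RankVicuna was trained on is defined in the authors' code, and the function names here (`build_ranking_prompt`, `parse_ranking`) are hypothetical.

```python
# Illustrative sketch of a listwise reranking prompt and output parser.
# The precise template RankVicuna expects lives in the authors' repository;
# this simplified stand-in only demonstrates the general pattern.
import re

def build_ranking_prompt(query: str, passages: list[str]) -> str:
    """Assemble a listwise prompt asking the model to order passages by relevance."""
    lines = [
        f"I will provide you with {len(passages)} passages, each indicated by a "
        f"numerical identifier []. Rank the passages based on their relevance "
        f"to the search query: {query}."
    ]
    for i, passage in enumerate(passages, start=1):
        lines.append(f"[{i}] {passage}")
    lines.append(f"Search Query: {query}")
    lines.append(
        "Rank the passages above based on their relevance to the search query. "
        "The output format should be [] > [], e.g., [1] > [2]. "
        "Only respond with the ranking results."
    )
    return "\n".join(lines)

def parse_ranking(output: str, num_passages: int) -> list[int]:
    """Extract ranked identifiers from model output like '[2] > [1] > [3]'.

    Duplicated or out-of-range identifiers are dropped, and any identifiers
    the model omitted are appended in their original order, so the result is
    always a full permutation of 1..num_passages.
    """
    seen: list[int] = []
    for match in re.findall(r"\[(\d+)\]", output):
        idx = int(match)
        if 1 <= idx <= num_passages and idx not in seen:
            seen.append(idx)
    seen.extend(i for i in range(1, num_passages + 1) if i not in seen)
    return seen
```

In practice the prompt would be fed to the model (e.g. via a Hugging Face `transformers` generation call) and the generated text passed to `parse_ranking`; the repair step in the parser matters because instruction-tuned rankers occasionally repeat or skip identifiers.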
Evaluation
The model has been evaluated on the TREC 2019 and 2020 Deep Learning Track datasets (DL19/DL20), with further details available in the associated paper.