Overview
quicktensor/blockrank-msmarco-mistral-7b is a 7-billion-parameter model fine-tuned from Mistral-7B-Instruct-v0.3 for scalable in-context document ranking. Developed by Nilesh Gupta and collaborators, it applies the BlockRank method to improve both the efficiency and the quality of ranking tasks.
Key Capabilities
- Efficient In-context Ranking: Optimized for ranking documents within the model's context window.
- Linear Complexity Attention: Utilizes structured sparse attention to reduce computational complexity from O(n²) to O(n), making it highly scalable.
- Faster Inference: Achieves 2-4x faster ranking inference by scoring documents from internal attention signals rather than autoregressive decoding.
- Improved Relevance Signals: Incorporates an auxiliary contrastive loss at mid-layers to strengthen relevance signals.
- Strong Zero-shot Generalization: Demonstrates state-of-the-art performance on BEIR benchmarks without specific in-domain training.
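The linear-complexity claim above follows from the attention structure: rather than letting every token attend to every other token, each candidate document attends only to itself and a shared instruction prefix, while the final query segment attends globally. The sketch below is a hypothetical cost model of such a block-sparse mask (the block sizes and segment layout are illustrative assumptions, not the model's actual configuration); it shows that attended token pairs grow linearly with the number of documents, whereas dense attention grows quadratically.

```python
def attention_pairs(n_docs, block_len=64, prefix_len=32, query_len=16):
    """Count attended (query, key) pairs under a block-sparse causal mask.

    Assumed layout (illustrative): shared instruction prefix, then one
    block per candidate document, then the query segment at the end.
    """
    # Prefix tokens attend causally within the prefix.
    prefix = prefix_len * (prefix_len + 1) // 2
    # Each document block attends to the prefix and causally to itself --
    # never to other documents, which is what breaks the quadratic term.
    per_doc = block_len * prefix_len + block_len * (block_len + 1) // 2
    # The query segment attends to the full context before it.
    total_ctx = prefix_len + n_docs * block_len
    query = query_len * total_ctx + query_len * (query_len + 1) // 2
    return prefix + n_docs * per_doc + query


def dense_pairs(n_docs, block_len=64, prefix_len=32, query_len=16):
    """Cost of ordinary dense causal attention over the same sequence."""
    n = prefix_len + n_docs * block_len + query_len
    return n * (n + 1) // 2


# Doubling the candidate set roughly doubles the sparse cost (linear),
# but roughly quadruples the dense cost (quadratic).
print(attention_pairs(100), attention_pairs(200))
print(dense_pairs(100), dense_pairs(200))
```

Because per-document cost is constant, the context window can hold many more candidates before attention becomes the bottleneck.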
Good For
This model is well suited to applications that need fast, accurate document ranking, particularly retrieval-augmented generation and information-retrieval pipelines where a large language model re-ranks candidate documents. Its sparse-attention architecture keeps latency and compute cost manageable as the number of candidates grows.
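For in-context ranking, the candidate documents and the query are packed into a single prompt. The template below is an illustrative assumption, not the format the model was fine-tuned with; check the model card or the tokenizer's chat template before relying on it. The commented inference call uses the standard Hugging Face `transformers` API, though per the description above, BlockRank derives relevance from internal attention signals rather than from generated text.

```python
def build_ranking_prompt(query, documents):
    """Assemble a listwise prompt: instruction, numbered candidate
    documents, then the query at the end (hypothetical template)."""
    lines = ["Rank the following documents by relevance to the query.", ""]
    for i, doc in enumerate(documents, start=1):
        lines.append(f"[{i}] {doc}")
    lines += ["", f"Query: {query}"]
    return "\n".join(lines)


prompt = build_ranking_prompt(
    "what is block-sparse attention",
    [
        "Sparse attention restricts which tokens may attend to each other.",
        "MS MARCO is a large-scale passage ranking dataset.",
    ],
)

# Inference would then go through the usual transformers workflow, e.g.:
# from transformers import AutoModelForCausalLM, AutoTokenizer
# name = "quicktensor/blockrank-msmarco-mistral-7b"
# tok = AutoTokenizer.from_pretrained(name)
# model = AutoModelForCausalLM.from_pretrained(name, device_map="auto")
# inputs = tok(prompt, return_tensors="pt").to(model.device)
# outputs = model(**inputs)
```

Keeping all candidates in one prompt is what lets the block-sparse attention amortize the shared instruction prefix across documents.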