Alibaba-NLP/ERank-32B

Text Generation | Concurrency Cost: 2 | Model Size: 32B | Quant: FP8 | Ctx Length: 32k | Published: Aug 26, 2025 | License: apache-2.0 | Architecture: Transformer | Open Weights

Alibaba-NLP/ERank-32B is a 32 billion parameter pointwise text reranker developed by Alibaba-NLP, designed for effective and efficient document relevance scoring. This model utilizes a novel two-stage training pipeline combining Supervised Fine-Tuning (SFT) for generative integer score output and Reinforcement Learning (RL) with a listwise derived reward. ERank-32B excels in reasoning-intensive reranking tasks, outperforming many listwise rerankers while maintaining low latency due to its pointwise architecture. It supports custom input instructions and has a context length of 128K tokens.

ERank-32B: An Effective and Efficient Text Reranker

ERank-32B, developed by Alibaba-NLP, is a 32 billion parameter pointwise reranker built from a reasoning LLM, designed for high effectiveness and low latency across diverse relevance scenarios. It notably outperforms recent listwise rerankers on challenging reasoning-intensive tasks.

Key Capabilities and Training

  • Two-Stage Training: ERank is trained with a two-stage pipeline:
    • Supervised Fine-Tuning (SFT): Unlike traditional rerankers, ERank is trained to generate fine-grained integer relevance scores.
    • Reinforcement Learning (RL): A listwise-derived reward instills global ranking awareness into the efficient pointwise architecture.
  • Instruction Awareness: The model supports customizing input instructions for different tasks.
  • High Context Length: Features a substantial 128K token sequence length.
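
The pointwise, score-generating design described above can be sketched as follows. This is a minimal illustration, not ERank's actual interface: `generate` is a hypothetical callable standing in for the model, and taking the last integer in the generated text as the score is an assumption about the output format.

```python
import re
from typing import Callable, List, Tuple


def parse_integer_score(generated_text: str, default: int = 0) -> int:
    """Return the last integer in the generated text, or `default` if none.

    Assumption: the relevance score is the final integer the model emits.
    """
    matches = re.findall(r"-?\d+", generated_text)
    return int(matches[-1]) if matches else default


def pointwise_rerank(
    query: str,
    docs: List[str],
    generate: Callable[[str, str], str],  # hypothetical model wrapper
) -> List[Tuple[str, int]]:
    """Score every (query, document) pair independently, then sort by score."""
    scored = [(doc, parse_integer_score(generate(query, doc))) for doc in docs]
    # Higher score means more relevant; ties keep their original order.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Because each document is scored on its own, the calls to `generate` can be batched or run in parallel, which is where a pointwise reranker gets its latency advantage.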

Performance and Efficiency

ERank-32B demonstrates strong performance across various benchmarks:

  • Reasoning-Intensive Tasks: Averages 38.1 across the BRIGHT, FollowIR, BEIR, and TREC DL benchmarks, including 24.4 on BRIGHT and 12.1 on FollowIR, ahead of other pointwise rerankers and many listwise ones.
  • State-of-the-Art on BRIGHT: When its scores are combined with BM25 scores (hybrid scoring), ERank-32B achieves a state-of-the-art nDCG@10 of 40.2 on the challenging BRIGHT benchmark.
  • Low Latency: As a pointwise reranker, ERank scores each query-document pair independently, which gives it significantly lower latency than listwise models and suits it to real-time applications.
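
For readers unfamiliar with the nDCG@10 metric cited above, here is a minimal sketch of the common linear-gain form. The function names are illustrative, and two simplifications are assumed: the ideal ranking is computed only from the relevance labels passed in, and benchmark tooling may use a different gain function.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain over the top-k items (linear gain)."""
    # The item at 0-based position `rank` is discounted by log2(rank + 2).
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG of the given ranking divided by the DCG of the ideal ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

Here `relevances` holds the graded relevance labels of the documents in the order the reranker returned them; a perfectly ordered list yields 1.0.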

Usage

ERank-32B can be run for inference with either Transformers or vLLM. Users rerank documents given a query and an instruction, and can optionally combine the reranker's scores with first-stage retriever scores (hybrid scoring).
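
One plausible way to combine reranker and first-stage scores is sketched below. The min-max normalization, the weight `alpha`, and the function names are all assumptions made for illustration; the model card does not specify the fusion formula here, so treat this as a sketch rather than ERank's own method.

```python
from typing import Dict, List


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1]; all-equal (or empty) inputs map to 0.0."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 0.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_rank(
    rerank_scores: Dict[str, float],
    retriever_scores: Dict[str, float],
    alpha: float = 0.5,  # hypothetical weight on the reranker score
) -> List[str]:
    """Return doc ids ordered by a weighted sum of normalized scores."""
    r = min_max(rerank_scores)
    f = min_max(retriever_scores)
    # Documents missing a first-stage score contribute 0.0 for that term.
    combined = {d: alpha * r[d] + (1 - alpha) * f.get(d, 0.0) for d in r}
    return sorted(combined, key=combined.get, reverse=True)


if __name__ == "__main__":
    reranker = {"a": 8, "b": 5, "c": 2}    # e.g. generated integer scores
    bm25 = {"a": 1.0, "b": 9.0, "c": 3.0}  # e.g. first-stage BM25 scores
    print(hybrid_rank(reranker, bm25))
```

Normalizing first keeps the two score scales comparable; `alpha` would normally be tuned on held-out data.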