DeepRetrieval/DeepRetrieval-PubMed-3B-Llama

Text Generation | Concurrency Cost: 1 | Model Size: 3.2B | Quant: BF16 | Ctx Length: 32k | Published: Mar 31, 2025 | License: MIT | Architecture: Transformer | Open Weights

DeepRetrieval/DeepRetrieval-PubMed-3B-Llama is a 3.2 billion parameter Llama-based model developed by DeepRetrieval and trained with a novel reinforcement learning approach to generate search queries. It requires no supervised data: it learns by trial and error, using retrieval metrics as rewards. The model is designed to "hack" real search engines and retrievers, that is, to tune its queries to what a given retrieval system actually returns, and it is reported to reach state-of-the-art performance across diverse retrieval scenarios. It suits applications that need effective query generation for information retrieval.


DeepRetrieval-PubMed-3B-Llama Overview

DeepRetrieval-PubMed-3B-Llama is a 3.2 billion parameter model that uses a novel reinforcement learning (RL) framework for query generation. Developed by DeepRetrieval, it removes the need for expensive human-annotated or distilled reference queries, which traditional query generation methods commonly require. Instead, it learns to optimize queries directly for retrieval performance through trial and error, using retrieval metrics as rewards.
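
To make "retrieval metrics as rewards" concrete, here is a minimal Python sketch of scoring a generated query by recall@k. It is an illustration under stated assumptions, not the project's training code: the choice of recall@k, the `toy_search` term-overlap retriever, the `reward` function, and the three-document corpus are all hypothetical stand-ins for a real search engine and real relevance labels.

```python
from typing import Dict, List, Sequence, Set


def recall_at_k(retrieved_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    top_k = set(retrieved_ids[:k])
    return len(top_k & relevant_ids) / len(relevant_ids)


def toy_search(query: str, corpus: Dict[str, str]) -> List[str]:
    """Hypothetical stand-in for a real search engine: rank by term overlap."""
    terms = set(query.lower().split())
    scored = [
        (len(terms & set(text.lower().split())), doc_id)
        for doc_id, text in corpus.items()
    ]
    # Highest overlap first, ties broken by doc id; documents with no overlap are dropped.
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ranked if score > 0]


def reward(query: str, corpus: Dict[str, str], relevant_ids: Set[str], k: int = 2) -> float:
    """Reward for one generated query: the retrieval metric it achieves (assumed recall@k)."""
    return recall_at_k(toy_search(query, corpus), relevant_ids, k)


# Illustrative data only.
CORPUS = {
    "d1": "metformin therapy in type 2 diabetes",
    "d2": "insulin resistance and metformin response",
    "d3": "statin use and cardiovascular risk",
}
RELEVANT = {"d1", "d2"}

print(reward("diabetes drug", CORPUS, RELEVANT))
print(reward("metformin diabetes insulin", CORPUS, RELEVANT))
```

In this toy setup the second query retrieves both relevant documents and therefore earns the higher reward. An RL loop would reinforce queries of that kind without ever seeing a reference query, only the score.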

Key Capabilities

  • Unsupervised Query Generation: Learns to generate effective queries without relying on supervised data.
  • RL-Based Optimization: Employs reinforcement learning to directly enhance retrieval performance.
  • High Performance: Achieves strong results across various retrieval tasks.

Good For

  • Information Retrieval Systems: Enhancing the query generation component of search engines and retrieval systems.
  • Research and Development: Exploring novel reinforcement learning applications in natural language processing.
  • Cost-Effective Solutions: Reducing the need for extensive human annotation or data distillation in query generation.
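
As a rough sketch of the first use case, the snippet below places a query rewriter in front of a retriever. Everything here is assumed for illustration: `search_with_rewrite`, `placeholder_rewrite`, `keyword_retrieve`, and the sample documents are hypothetical. In a real system the rewriter would call the model, and the prompt format it expects should be taken from the project's documentation rather than from this example.

```python
from typing import Callable, List

# Illustrative documents only.
DOCS = {
    "a": "aspirin after myocardial infarction",
    "b": "heart attack symptoms in women",
    "c": "knee surgery recovery",
}

# Hypothetical synonym table used by the placeholder rewriter.
SYNONYMS = {"heart attack": "myocardial infarction"}


def placeholder_rewrite(query: str) -> str:
    """Stand-in for the query generation model: appends known synonyms."""
    extra = [syn for phrase, syn in SYNONYMS.items() if phrase in query.lower()]
    return " ".join([query] + extra)


def keyword_retrieve(query: str) -> List[str]:
    """Stand-in retriever: rank documents by number of shared terms."""
    terms = set(query.lower().split())
    hits = [(len(terms & set(text.split())), doc_id) for doc_id, text in DOCS.items()]
    ranked = sorted(hits, key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in ranked if score > 0]


def search_with_rewrite(
    user_query: str,
    rewrite: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
    top_k: int = 5,
) -> List[str]:
    """Rewrite the user's query, then retrieve with the rewritten form."""
    rewritten = rewrite(user_query).strip()
    # Fall back to the original query if the rewriter returns nothing usable.
    effective_query = rewritten or user_query
    return retrieve(effective_query)[:top_k]


baseline = search_with_rewrite("heart attack treatment", lambda q: q, keyword_retrieve)
rewritten = search_with_rewrite("heart attack treatment", placeholder_rewrite, keyword_retrieve)
print(baseline, rewritten)
```

Keeping the rewriter and retriever as plain callables means the placeholder can be swapped for a model-backed function without touching the rest of the pipeline.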

For more technical details, refer to the DeepRetrieval paper and the project's GitHub page.