nixiesearch/nixie-querygen-v2

Task: Text generation · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Mar 20, 2024 · License: apache-2.0 · Architecture: Transformer · Open weights

nixiesearch/nixie-querygen-v2 is a 7-billion-parameter Mistral-7B-v0.1 model fine-tuned by nixiesearch for query generation, specifically for creating synthetic queries from documents. It supports a 4096-token context length and is intended for expanding datasets for embedding training, or for generating queries when only documents are available. The model follows the docTTTTTquery approach and was trained on 200k query-document pairs from diverse IR datasets.


Overview

nixiesearch/nixie-querygen-v2 is a 7-billion-parameter language model, fine-tuned from Mistral-7B-v0.1 specifically for generating synthetic queries. It addresses the challenge of creating relevant queries when only document collections are available, or when expanding limited query-document datasets for embedding training. It leverages the principles of the docTTTTTquery approach.

Key Capabilities

  • Synthetic Query Generation: Creates queries from documents, useful for downstream embedding fine-tuning tasks where explicit queries are scarce.
  • Dataset Expansion: Enhances existing, small query-document datasets by generating additional synthetic queries based on the provided documents.
  • Flexible Prompting: Supports optional modifiers like [short|medium|long] and [question|regular] to control query characteristics.
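As a minimal sketch of how the modifiers above might be assembled into a generation prompt: the helper below composes a document plus `[short|medium|long]` and `[question|regular]` tags. The exact prompt template is an assumption inferred from the modifiers described here, not taken from the published model card, so verify it against the upstream repository before use.

```python
# Hedged sketch: prompt construction for a query-generation model.
# The "{document} [length] [style] query:" layout is an ASSUMPTION;
# check the actual nixie-querygen-v2 prompt format before relying on it.

def build_prompt(document: str, length: str = "medium", style: str = "regular") -> str:
    """Compose a query-generation prompt with the optional modifiers.

    length: one of "short", "medium", "long"
    style:  one of "question", "regular"
    """
    if length not in ("short", "medium", "long"):
        raise ValueError(f"unknown length modifier: {length}")
    if style not in ("question", "regular"):
        raise ValueError(f"unknown style modifier: {style}")
    return f"{document} [{length}] [{style}] query:"

# Example: ask for a short, question-style query for a document.
prompt = build_prompt("Nixiesearch is a hybrid search engine.", "short", "question")
```

The resulting string would then be passed to whichever inference backend you deploy (a PyTorch checkpoint or a GGUF build under llama.cpp).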

Training Details

The model was trained using nixietune on a dataset of 200,000 query-document pairs, sampled from a variety of Information Retrieval (IR) datasets. It supports a context length of 4096 tokens.

Deployment Options

Available in multiple formats for diverse deployment scenarios:

  • PyTorch FP16 checkpoint: Suitable for further fine-tuning.
  • GGUF F16 (non-quantized): For CPU inference with llama.cpp.
  • GGUF Q4_0 (quantized): For faster, less precise CPU inference with llama.cpp.