nixiesearch/nixie-querygen-v3

Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 4k · Published: Jul 3, 2024 · License: apache-2.0 · Architecture: Transformer

nixiesearch/nixie-querygen-v3 is a 7-billion-parameter Mistral-7B-v0.3 model fine-tuned by nixiesearch for query generation. It creates synthetic queries from documents, which is crucial for downstream embedding fine-tuning when only documents are available, and for expanding existing query-document datasets. By generating relevant search queries, it helps improve search relevance and retrieval systems.


nixie-querygen-v3: Specialized Query Generation Model

nixiesearch/nixie-querygen-v3 is a 7-billion-parameter model based on Mistral-7B-v0.3, fine-tuned specifically for generating synthetic search queries from documents. It is designed to address common challenges in information retrieval and embedding training, particularly when query-document pairs are scarce.

Key Capabilities

  • Synthetic Query Generation: Generates queries for documents when no existing queries or labels are available, facilitating downstream embedding fine-tuning. This process can be integrated with the nixietune toolkit.
  • Dataset Expansion: Enhances existing, limited query-document datasets by generating additional synthetic queries based on real ones, further improving embedding training.
  • Alpaca Prompt Format: Accepts a flexible Alpaca prompt format, allowing users to specify query characteristics such as length (short, medium, long) and type (question, regular); a prompting sketch follows this list.
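
For illustration, here is a minimal sketch of generating a query with Hugging Face transformers. The Alpaca-style template below, including the exact instruction wording and how the length and type hints are phrased, is an assumption based on the description above, not the canonical template; check the model card for the wording the model was trained on.

```python
# Minimal query-generation sketch with transformers.
# NOTE: the instruction text and prompt layout are assumptions;
# consult the nixiesearch/nixie-querygen-v3 model card for the
# exact Alpaca template used during fine-tuning.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nixiesearch/nixie-querygen-v3"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

document = (
    "Mistral-7B-v0.3 is a 7B-parameter decoder-only transformer "
    "with an extended vocabulary compared to v0.2."
)

# Alpaca-style prompt: instruction plus input document, response left open.
# "short" / "question" are the assumed length and type hints.
prompt = (
    "### Instruction:\n"
    "write a short question query for the following document\n\n"
    "### Input:\n"
    f"{document}\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs, max_new_tokens=32, do_sample=True, temperature=0.7
)
# Decode only the newly generated tokens, i.e. the query itself.
query = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
)
print(query.strip())
```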

Good For

  • Embedding Fine-tuning: Ideal for scenarios where only document collections are available, enabling the creation of synthetic queries necessary for training robust document embeddings.
  • Low-Resource Scenarios: Useful for expanding small, existing query-document datasets to improve the performance of search and retrieval systems.
  • CPU Inference: Available in GGUF format (F16 and Q4_0 quantized) for efficient CPU inference with llama-cpp, alongside PyTorch FP16 checkpoints for GPU-based fine-tuning and inference; see the CPU inference sketch below.
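
For CPU use, the sketch below loads an assumed Q4_0 GGUF file with llama-cpp-python (the Python binding for llama-cpp). The GGUF filename and, as above, the prompt wording are assumptions; substitute the actual file published in the repository.

```python
# CPU inference sketch using llama-cpp-python on the Q4_0 quantized GGUF.
# NOTE: the model_path filename is an assumption; download the actual
# GGUF file from the nixiesearch/nixie-querygen-v3 repository.
from llama_cpp import Llama

llm = Llama(
    model_path="nixie-querygen-v3.Q4_0.gguf",  # assumed filename
    n_ctx=4096,   # matches the model's 4k context length
    n_threads=8,  # tune to the number of physical cores
)

prompt = (
    "### Instruction:\n"
    "write a short question query for the following document\n\n"
    "### Input:\n"
    "GGUF is a file format for storing quantized LLM weights for llama.cpp.\n\n"
    "### Response:\n"
)

out = llm(prompt, max_tokens=32, temperature=0.7, stop=["###"])
print(out["choices"][0]["text"].strip())
```

Q4_0 trades some output quality for roughly a quarter of the memory footprint of the F16 GGUF, which is usually the right default on commodity CPUs.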