gzone0111/AutoGraphR1-musique_hotpotqa_train-llama3.2-3b-text-retriever-grpo-repetition-penalty

Text generation · Concurrency cost: 1 · Model size: 3.2B · Quantization: BF16 · Context length: 32k · Published: Oct 14, 2025 · License: MIT · Architecture: Transformer · Open weights

The gzone0111/AutoGraphR1-musique_hotpotqa_train-llama3.2-3b-text-retriever-grpo-repetition-penalty model is a 3.2-billion-parameter language model based on the Llama 3.2 architecture. It is fine-tuned for text retrieval on the multi-hop question-answering datasets MuSiQue and HotpotQA, and, as the model name indicates, was trained with GRPO (Group Relative Policy Optimization) using a repetition penalty to improve output quality. The model is intended for efficient, accurate information extraction and question answering over large text corpora, leveraging this specialized training for improved performance in those domains.


Model Overview

The gzone0111/AutoGraphR1-musique_hotpotqa_train-llama3.2-3b-text-retriever-grpo-repetition-penalty is a 3.2-billion-parameter language model built on the Llama 3.2 architecture. It has been fine-tuned specifically for text retrieval, with a focus on information extraction and multi-hop question answering.

Key Capabilities

  • Text Retrieval: Optimized for accurately retrieving relevant information from extensive text datasets.
  • Question Answering: Training on the multi-hop datasets MuSiQue and HotpotQA strengthens its ability to answer complex questions by identifying the relevant text segments.
  • Repetition Control: Fine-tuned with GRPO (Group Relative Policy Optimization) using a repetition penalty, which mitigates repetitive outputs and yields more coherent, diverse responses.
  • Efficient Size: At 3.2 billion parameters, it balances performance and computational cost, making it practical to deploy in a range of environments.
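The repetition penalty named in the model identifier is conventionally applied at the logit level during decoding: logits of tokens that already appear in the output are divided by the penalty when positive and multiplied when negative, so a penalty above 1 always makes seen tokens less likely. A minimal pure-Python sketch of that convention (the token IDs and logit values below are made up for illustration):

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.2):
    """Rescale logits of tokens that already appear in the output.

    Follows the common convention: positive logits are divided by the
    penalty, negative logits are multiplied, so previously generated
    tokens become less likely whenever penalty > 1.
    """
    adjusted = list(logits)
    for tok in set(generated_ids):
        if adjusted[tok] > 0:
            adjusted[tok] /= penalty
        else:
            adjusted[tok] *= penalty
    return adjusted

# Toy vocabulary of 4 tokens; token 2 was already generated.
logits = [1.0, -0.5, 2.0, 0.0]
penalized = apply_repetition_penalty(logits, generated_ids=[2], penalty=2.0)
# Token 2's logit drops from 2.0 to 1.0; the others are unchanged.
```

In practice this transformation is applied automatically when a `repetition_penalty` value is passed to a generation call; the sketch only shows the underlying rescaling rule.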

Good For

  • Information Extraction Systems: Ideal for applications requiring precise data extraction from documents or large text collections.
  • Advanced QA Systems: Suitable for building sophisticated question-answering systems that need to process and synthesize information from multiple sources.
  • Semantic Search: Can improve the relevance and accuracy of semantic search functionality.
  • Research and Development: Provides a strong baseline for further research into text retrieval, question answering, and repetition control techniques in LLMs.
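As a sketch of the semantic-search use case above: a text retriever maps queries and passages into a shared vector space, and relevance is scored by cosine similarity between those vectors. The embeddings below are toy three-dimensional numbers for illustration, not actual output of this model:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def rank_passages(query_vec, passage_vecs):
    """Return passage indices sorted by similarity to the query, best first."""
    scores = [cosine_similarity(query_vec, p) for p in passage_vecs]
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Toy embeddings (illustrative only).
query = [1.0, 0.0, 1.0]
passages = [
    [0.9, 0.1, 0.8],   # closely aligned with the query
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.5, 0.5, 0.5],   # partially aligned
]
ranking = rank_passages(query, passages)
# ranking lists passage indices from most to least similar
```

A real pipeline would obtain the vectors from the retriever model and typically use an approximate-nearest-neighbor index rather than exhaustive scoring, but the ranking principle is the same.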