voidism/SelfCite-8B-from-CC
Text Generation · Model Size: 8B · Quant: FP8 · Context Length: 32k · Concurrency Cost: 1 · Published: Feb 7, 2025 · License: llama3.1 · Architecture: Transformer

voidism/SelfCite-8B-from-CC is an 8 billion parameter language model developed by researchers from the Massachusetts Institute of Technology and Meta AI. It is a reproduction of the SelfCite 8B SimPO fine-tuned model, initialized from Llama-3.1-8B-Instruct and further trained with SFT data from ContextCite. The model targets self-supervised alignment for context attribution, aiming to improve the accuracy of citations and the factual grounding of generated text. With a 32,768-token context window, it is designed for tasks requiring precise source referencing.


SelfCite: Self-Supervised Alignment for Context Attribution

This model reproduces the SelfCite 8B SimPO fine-tuned model from the SelfCite paper, which trains large language models to attribute their generated statements to the supporting context through a fully self-supervised alignment procedure. It is initialized from Llama-3.1-8B-Instruct and first fine-tuned with SFT data from ContextCite, as detailed in the paper.

Key Capabilities

  • Improved Context Attribution: Specializes in accurately linking generated text to its source context.
  • Self-Supervised Alignment: Utilizes a novel self-supervised training methodology for enhanced factual grounding.
  • Llama-3.1-8B-Instruct Base: Benefits from the strong foundational capabilities of its base model.
  • Long Context Handling: Supports a context length of 32768 tokens, suitable for tasks requiring extensive contextual understanding.
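In use, a context-attribution model like this one emits answers whose claims point back to sentences in the supplied context. The exact citation format is model-specific and should be checked against the SelfCite paper; the sketch below assumes a hypothetical bracketed-index convention (e.g. `[2]` cites the second context sentence) purely to illustrate how cited spans can be resolved back to their sources.

```python
import re


def resolve_citations(answer: str, context_sentences: list[str]) -> dict[int, str]:
    """Map bracketed citation indices in a model answer back to source sentences.

    Assumes a hypothetical format in which the model cites 1-based sentence
    indices like "[2]" or "[1][3]"; verify the actual SelfCite output format
    before relying on this parsing.
    """
    cited = {}
    for match in re.finditer(r"\[(\d+)\]", answer):
        idx = int(match.group(1))
        # Ignore indices that fall outside the provided context.
        if 1 <= idx <= len(context_sentences):
            cited[idx] = context_sentences[idx - 1]
    return cited


context = [
    "The Amazon is the largest tropical rainforest.",
    "It spans nine countries in South America.",
    "The river of the same name is the largest by discharge.",
]
answer = "The rainforest spans nine countries [2] and shares its name with the river [3]."
print(resolve_citations(answer, context))
```

A verification layer like this lets an application surface the exact source sentence behind each cited claim, which is the kind of downstream check that fine-grained context attribution enables.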

Good For

  • Applications requiring high-fidelity citation and source attribution.
  • Research into self-supervised learning for factual consistency in LLMs.
  • Tasks where precise referencing of input context is critical to prevent hallucination.