SelfCite: Self-Supervised Alignment for Context Attribution
voidism/SelfCite-8B-from-CC is an 8-billion-parameter language model developed by researchers at MIT and Meta AI. It reproduces the SelfCite 8B SimPO fine-tuned model and is designed to improve context attribution in large language models through self-supervised alignment. The model is initialized from Llama-3.1-8B-Instruct and further fine-tuned with SFT data from ContextCite; the training pipeline is fully self-supervised, as detailed in the SelfCite paper.
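As a rough illustration, the sketch below loads the model with the standard Hugging Face transformers causal-LM interface, which the Llama-3.1 family follows. Note that the exact prompt and citation format SelfCite expects is defined in the authors' repository; the context/question prompt here is only a placeholder assumption.

```python
# Minimal sketch: querying voidism/SelfCite-8B-from-CC via transformers.
# The prompt layout below is illustrative only; consult the SelfCite repo
# for the segmented-context format the model was actually trained on.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "voidism/SelfCite-8B-from-CC"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # reduces memory for the 8B weights
    device_map="auto",
)

# Illustrative long-context QA prompt; the source document(s) can fill
# up to the 32,768-token context window.
context = "..."  # placeholder for the source text to be cited
question = "What does the document say about X?"
messages = [
    {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=512, do_sample=False)

# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```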
Key Capabilities
- Improved Context Attribution: Specializes in accurately linking generated text to its source context.
- Self-Supervised Alignment: Utilizes a novel self-supervised training methodology for enhanced factual grounding.
- Llama-3.1-8B-Instruct Base: Benefits from the strong foundational capabilities of its base model.
- Long Context Handling: Supports a 32,768-token context window, suitable for tasks requiring extensive contextual understanding.
Good For
- Applications requiring high-fidelity citation and source attribution.
- Research into self-supervised learning for factual consistency in LLMs.
- Tasks where precise referencing of the input context is critical for preventing hallucination.