LongForm-LLaMA-7B-diff Overview
LongForm-LLaMA-7B-diff is a 7-billion-parameter LLaMA-based model developed by Abdullatif Köksal, fine-tuned specifically for long-form text generation. Because of LLaMA's licensing restrictions, it is released as a diff (difference) against the original LLaMA-7B weights: users must apply the released diff to their own copy of LLaMA-7B to recover the full model. It is trained on the LongForm dataset, which is constructed from diverse human-written documents paired with generated instructions and extended with structured examples and additional task types such as question answering and story generation.
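The diff-release scheme described above can be sketched as adding each diff parameter to the matching base parameter, key by key. This is an illustrative sketch only: the real checkpoints are PyTorch tensors, the `apply_diff` helper is hypothetical, and the actual recovery procedure (including any checksum verification) should follow the LongForm repository's instructions. Tiny NumPy arrays stand in for model parameters here.

```python
import numpy as np

def apply_diff(base: dict, diff: dict) -> dict:
    """Recover full weights by adding each diff array to the matching base array."""
    if base.keys() != diff.keys():
        raise ValueError("base and diff checkpoints must share parameter names")
    return {name: base[name] + diff[name] for name in base}

# Toy "checkpoints" with two parameters each, standing in for LLaMA-7B weights.
base = {"embed.weight": np.ones((2, 3)), "lm_head.weight": np.zeros((3, 2))}
diff = {"embed.weight": np.full((2, 3), 0.5), "lm_head.weight": np.ones((3, 2))}

full = apply_diff(base, diff)
print(full["embed.weight"][0, 0])  # 1.5
```

In the real release the same idea applies at checkpoint scale: the diff is useless on its own, which is what lets the fine-tuned weights be distributed without redistributing LLaMA itself.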
Key Capabilities
- Superior Long Text Generation: Outperforms other instruction-tuned models in generating extended, coherent text across various domains.
- Instruction Following: Designed to follow instructions effectively for tasks requiring detailed outputs.
- Diverse Task Performance: Demonstrates strong performance in:
  - Recipe Generation (RGen)
  - Long-form Question Answering (ELI5)
  - Short Story Generation (WritingPrompts/WP)
Good for
- Developers needing a model optimized for generating lengthy, detailed responses.
- Applications requiring creative writing, such as story or poem generation.
- Systems that benefit from comprehensive answers to complex questions.
- Tasks involving text summarization or email writing where extended output is desired.
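For the use cases above, generation follows the standard Hugging Face causal-LM workflow once the full weights have been recovered. The sketch below is hedged throughout: the local path `./longform-llama-7b` and the plain instruction-as-prompt template are assumptions (consult the LongForm repository for the exact prompt format used during fine-tuning), and the model-loading calls are commented out because they require the 7B weights.

```python
def build_prompt(instruction: str) -> str:
    # Hypothetical template: the bare instruction followed by a newline.
    # The template actually used in LongForm fine-tuning may differ.
    return instruction.strip() + "\n"

prompt = build_prompt("Write a short story about a lighthouse keeper.")
print(prompt)

# Generation itself (requires the recovered 7B weights on disk):
# from transformers import AutoModelForCausalLM, AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained("./longform-llama-7b")
# model = AutoModelForCausalLM.from_pretrained("./longform-llama-7b")
# inputs = tokenizer(prompt, return_tensors="pt")
# output = model.generate(**inputs, max_new_tokens=1024, do_sample=True, top_p=0.9)
# print(tokenizer.decode(output[0], skip_special_tokens=True))
```

A generous `max_new_tokens` budget matters here, since the model's distinguishing strength is producing long, coherent outputs rather than short completions.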
Performance Highlights
LongForm-LLaMA-7B achieves an average METEOR score of 19.7 across the evaluated out-of-domain datasets, significantly surpassing models such as Alpaca-LLaMA-7B (14.6) and Flan-T5 (10.6). It particularly excels at Recipe Generation (21.7 METEOR) and ELI5 (18.6 METEOR), demonstrating its specialization in producing high-quality long-form content. For more details, refer to the LongForm GitHub repository and the associated research paper.