akoksal/LongForm-LLaMA-7B-diff

Text Generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Apr 26, 2023 · Architecture: Transformer

The akoksal/LongForm-LLaMA-7B-diff model is a 7-billion-parameter LLaMA-based language model fine-tuned by Abdullatif Köksal for long-form text generation. Because the original LLaMA-7B weights cannot be redistributed, it is published as a weight diff that must be applied to a local copy of LLaMA-7B. The model excels at tasks such as recipe generation, long-form question answering, and story generation, outperforming earlier instruction-tuned models on out-of-domain datasets. Its primary use case is generating extended, coherent text from instructions.


LongForm-LLaMA-7B-diff Overview

LongForm-LLaMA-7B-diff is a 7-billion-parameter LLaMA-based model developed by Abdullatif Köksal, specifically fine-tuned for long-form text generation. The model is released as a diff (difference) against the original LLaMA-7B weights: due to licensing restrictions, users must apply the released diff tensors to a pre-existing LLaMA-7B installation to recover the fine-tuned model. It is trained on the LongForm dataset, which pairs diverse human-written corpus documents with generated instructions (the paper's "reverse instructions" approach) and extends them with structured corpora and task types such as question answering and story generation.
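The release repository provides the exact recovery procedure; conceptually, applying the diff is element-wise addition of each diff tensor to the matching base tensor. A minimal sketch with dummy tensors (the `apply_diff` helper and parameter names are illustrative, not from the official scripts):

```python
import torch

def apply_diff(base_state: dict, diff_state: dict) -> dict:
    """Recover fine-tuned weights by adding each diff tensor to the
    matching base tensor (illustrative; the official repo ships its
    own recovery procedure)."""
    return {name: base_state[name] + diff_state[name] for name in base_state}

# Dummy stand-ins for LLaMA-7B base weights and the released diff.
base = {"layers.0.wq": torch.ones(2, 2), "layers.0.wk": torch.zeros(2, 2)}
diff = {"layers.0.wq": torch.full((2, 2), 0.5), "layers.0.wk": torch.ones(2, 2)}

recovered = apply_diff(base, diff)
print(recovered["layers.0.wq"])  # each entry is 1.5
```

In the real setting, `base` and `diff` would be the `state_dict()`s of the original LLaMA-7B checkpoint and the downloaded diff checkpoint, and the result would be saved back to disk for loading.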

Key Capabilities

  • Superior Long Text Generation: Outperforms other instruction-tuned models in generating extended, coherent text across various domains.
  • Instruction Following: Designed to follow instructions effectively for tasks requiring detailed outputs.
  • Diverse Task Performance: Demonstrates strong performance in:
    • Recipe Generation (RGen)
    • Long-form Question Answering (ELI5)
    • Short Story Generation (WritingPrompts/WP)

Good for

  • Developers needing a model optimized for generating lengthy, detailed responses.
  • Applications requiring creative writing, such as story or poem generation.
  • Systems that benefit from comprehensive answers to complex questions.
  • Tasks involving text summarization or email writing where extended output is desired.

Performance Highlights

LongForm-LLaMA-7B achieves an average METEOR score of 19.7 across the evaluated out-of-domain datasets, significantly surpassing models like Alpaca-LLaMA-7B (14.6) and Flan-T5 (10.6). It particularly excels in Recipe Generation (21.7 METEOR) and ELI5 (18.6 METEOR), demonstrating its specialization in producing high-quality long-form content. For more details, refer to the LongForm GitHub repository and the associated research paper.