allenai/open-instruct-stanford-alpaca-7b

Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Jun 7, 2023 | Architecture: Transformer

The allenai/open-instruct-stanford-alpaca-7b is a 7 billion parameter LLaMA model, fine-tuned by AllenAI on the Stanford Alpaca dataset. It is distributed as a weight diff, so the full model must be recovered by combining the diff with an existing LLaMA model. It was developed as part of research on instruction tuning with open resources and is intended for general-purpose instruction-following and conversational use.


Overview

The allenai/open-instruct-stanford-alpaca-7b is a 7 billion parameter LLaMA model developed by AllenAI and fine-tuned on the Stanford Alpaca dataset to improve instruction following. It is released as a "model diff": users must apply the diff to an existing LLaMA base model to recover the full instruction-tuned model. The training methodology and evaluation are described in the paper "How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources."
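To make the "model diff" idea concrete, here is a toy sketch in plain Python. It assumes the diff is an element-wise difference between the tuned and base weights, which this page does not state explicitly. The names make_diff and recover are hypothetical, the "weights" are short lists of floats, and this is not the weight_diff.py script itself.

```python
from typing import Dict, List

# Toy stand-in for a checkpoint: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def make_diff(tuned: Weights, base: Weights) -> Weights:
    """Hypothetical helper: store only tuned minus base for each parameter."""
    return {
        name: [t - b for t, b in zip(tuned[name], base[name])]
        for name in tuned
    }


def recover(base: Weights, diff: Weights) -> Weights:
    """Hypothetical helper: rebuild the tuned weights as base plus diff.

    Assumption: the diff was produced by element-wise subtraction.
    """
    if base.keys() != diff.keys():
        raise ValueError("base and diff must contain the same parameter names")
    return {
        name: [b + d for b, d in zip(base[name], diff[name])]
        for name in base
    }


# Values chosen to be exactly representable in binary floating point.
base = {"layer.weight": [0.5, 0.25]}
tuned = {"layer.weight": [0.75, -0.5]}

diff = make_diff(tuned, base)
recovered = recover(base, diff)
print(recovered == tuned)
```

The point of the sketch is only that the diff is unusable by itself: without the matching base weights, the tuned model cannot be reconstructed.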

Key Capabilities

  • Instruction Following: Fine-tuned on the Stanford Alpaca dataset to understand and carry out user instructions.
  • General-Purpose Text Generation: Generates coherent, contextually relevant text in response to prompts.
  • Research-Oriented: Developed as part of a research effort on instruction tuning, and usable as a reference point for further study in this area.

Performance Highlights

Evaluated across various benchmarks, the model achieved:

  • MMLU (0-shot): 41.5
  • MMLU (5-shot): 40.3
  • GSM Direct: 7.0
  • BBH Direct: 32.6
  • Codex-Eval Pass@1: 13.2
  • AlpacaFarm vs Davinci-003: 21.1

Usage Notes

To use this model, you need access to a LLaMA model in Hugging Face format. The weight_diff.py script from the allenai/open-instruct repository recovers the full model from the diff. For best results, format inputs as <|user|> Your message here! <|assistant|>.
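The input format can be wrapped in a small helper. This is a minimal sketch: format_prompt is a hypothetical name, and the line breaks around the markers (each marker on its own line, with a trailing newline after <|assistant|>) are an assumption, since the flattened text on this page does not show where the breaks fall.

```python
def format_prompt(message: str) -> str:
    """Wrap a single user message in the <|user|> / <|assistant|> markers.

    Assumption: each marker sits on its own line and the prompt ends with a
    newline after <|assistant|>. Adjust if the recovered model expects a
    different layout.
    """
    return f"<|user|>\n{message.strip()}\n<|assistant|>\n"


prompt = format_prompt("Your message here!")
print(prompt)
```

The resulting string is what would be tokenized and passed to the recovered model for generation.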