allenai/open-instruct-self-instruct-13b

Text Generation | Concurrency Cost: 1 | Model Size: 13B | Quant: FP8 | Ctx Length: 4k | Published: Jun 7, 2023 | Architecture: Transformer | Cold

The allenai/open-instruct-self-instruct-13b is a 13-billion-parameter LLaMA model developed by AllenAI and fine-tuned on the Self-Instruct dataset for instruction-following tasks. It is suited to general-purpose conversational use and instruction-based text generation.


Overview

The allenai/open-instruct-self-instruct-13b is a 13-billion-parameter LLaMA model developed by AllenAI. It was fine-tuned on the Self-Instruct dataset, a collection of instruction data generated by a language model rather than written by hand. The model was developed as part of the research presented in the paper "How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources" (arXiv:2306.04751).

Key Capabilities

  • Instruction Following: Enhanced through the Self-instruct methodology, allowing it to respond effectively to diverse prompts.
  • General-Purpose Text Generation: Capable of generating coherent and contextually relevant text based on user instructions.
  • Benchmark Performance: Achieves an average score of 18.7 across a suite of benchmarks including MMLU, GSM, BBH, TydiQA, and Codex-Eval, as detailed in the associated research paper.

Usage and Integration

This model is distributed as a model diff, requiring users to recover the full model from an existing LLaMa base model using a provided script. It expects inputs formatted with specific <|user|> and <|assistant|> tokens, with a crucial newline after <|assistant|> for optimal generation quality.
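The prompt format described above can be sketched as a small helper. This is a minimal illustration, not the authors' code: the function name `format_prompt` is hypothetical, and placing each tag on its own line (including a newline after <|user|>) is an assumption. The only detail the description insists on is the newline after <|assistant|>.

```python
USER_TAG = "<|user|>"
ASSISTANT_TAG = "<|assistant|>"


def format_prompt(message: str) -> str:
    """Wrap a user message in the chat tags the model expects.

    Assumption: each tag sits on its own line. The trailing newline
    after the assistant tag is the part the model description calls
    crucial for generation quality.
    """
    return f"{USER_TAG}\n{message.strip()}\n{ASSISTANT_TAG}\n"


if __name__ == "__main__":
    # repr() makes the newlines visible when inspecting the prompt.
    print(repr(format_prompt("Summarize the plot of Hamlet in two sentences.")))
```

The returned string would be passed as the input text to whatever generation call is used with the recovered model. The helper strips surrounding whitespace from the message so that the prompt always ends with exactly one newline after the assistant tag.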

Good for

  • Research in Instruction Tuning: Useful for researchers studying instruction-following capabilities and training on model-generated instruction data.
  • Developing Conversational Agents: Suitable for building chatbots and interactive AI systems that require robust instruction adherence.
  • General NLP Tasks: Can be adapted for various text generation and understanding tasks where instruction-based interaction is beneficial.