allenai/open-instruct-sni-7b
Text Generation | Concurrency Cost: 1 | Model Size: 7B | Quant: FP8 | Ctx Length: 4k | Published: Jun 7, 2023 | Architecture: Transformer | Cold

allenai/open-instruct-sni-7b is a 7 billion parameter LLaMa model developed by AllenAI and fine-tuned on the Super-Natural Instructions dataset. It targets instruction following across a wide range of tasks, as explored in the paper "How Far Can Camels Go?". The model is distributed as a model diff, so an existing LLaMa base model is required to recover and use it. It is intended for general instruction-following applications.


Overview

allenai/open-instruct-sni-7b is a 7 billion parameter LLaMa model developed by AllenAI, fine-tuned using the Super-Natural Instructions (SNI) dataset. This model is presented as a "model diff," meaning it requires an existing LLaMa base model to be recovered and used. The training methodology and evaluation are detailed in the research paper "How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources."

Key Capabilities & Features

  • Instruction Following: Fine-tuned on the extensive Super-Natural Instructions dataset to improve adherence to diverse prompts.
  • LLaMa Base: Built upon the LLaMa architecture, leveraging its foundational language understanding.
  • Model Diff Format: Distributed as a weight difference, requiring a recovery script and an original LLaMa model for full functionality.
  • Standardized Input: Designed to work with a specific user/assistant input format for optimal generation quality.
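The "model diff" idea can be illustrated with a toy sketch. It is not the weight_diff.py script and does not reproduce its interface. It assumes the diff stores the per-parameter difference between the fine-tuned and base weights, so that recovery amounts to adding the diff back onto the base. The parameter names and values are hypothetical, and plain Python lists stand in for tensors.

```python
# Conceptual sketch only; the real recovery script is not reproduced here.
# Assumption: diff = tuned - base for every parameter, so tuned = base + diff.

def make_diff(base, tuned):
    """Return the per-parameter difference (tuned - base)."""
    return {
        name: [t - b for t, b in zip(tuned[name], base[name])]
        for name in base
    }

def recover(base, diff):
    """Rebuild the tuned weights by adding the diff onto the base weights."""
    if base.keys() != diff.keys():
        raise ValueError("base and diff must contain the same parameter names")
    return {
        name: [b + d for b, d in zip(base[name], diff[name])]
        for name in base
    }

# Toy "weights" with hypothetical parameter names.
base = {"layer.weight": [1.0, 2.0, 3.0], "layer.bias": [0.5]}
tuned = {"layer.weight": [1.5, 1.0, 3.0], "layer.bias": [0.25]}

diff = make_diff(base, tuned)
recovered = recover(base, diff)
```

Under this assumption the diff alone is not usable: the tuned weights can only be reconstructed when the matching base weights are available, which is why access to the original LLaMa model is required.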

Performance Highlights

The model was evaluated on several benchmarks. Reported scores include 44.1 on MMLU (0-shot), 47.9 on TydiQA (Gold-Passage), and 7.0 Pass@1 on Codex-Eval. The average across the benchmarks explored is 18.3.

Usage Considerations

Users must have access to a LLaMa model in Hugging Face format. The provided weight_diff.py script recovers the model by merging the diff with the base LLaMa weights. Input formatting matters for output quality; in particular, the <|assistant|> tag should be followed by a newline.
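A minimal sketch of building a prompt in the user/assistant format follows. The <|assistant|> tag comes from the text above. The <|user|> tag, the exact layout, and the helper name format_prompt are assumptions made for illustration and should be checked against the official model card.

```python
# Sketch of a prompt builder for the user/assistant input format.
USER_TAG = "<|user|>"            # assumption: not named in the text above
ASSISTANT_TAG = "<|assistant|>"  # tag named in the usage notes

def format_prompt(user_message: str) -> str:
    """Wrap a single user message and end with the assistant tag plus a newline."""
    return f"{USER_TAG}\n{user_message.strip()}\n{ASSISTANT_TAG}\n"

prompt = format_prompt("Summarize the following paragraph in one sentence.")
```

The trailing newline after the assistant tag is added deliberately, since the usage notes single it out as important for generation quality.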