Overview
allenai/open-instruct-sni-13b is a 13-billion-parameter LLaMa-based model developed by the Allen Institute for AI. It has been fine-tuned on the Super-Natural Instructions dataset to improve its ability to understand and follow a wide range of instructions. The model is released as a weight difference (diff), so recovering usable weights requires an existing LLaMa model in Hugging Face format.
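To make the idea of a weight diff concrete, here is a minimal conceptual sketch in Python. It assumes the diff is additive (recovered = base + diff, parameter by parameter); this summary does not describe how weight_diff.py works internally, so the function name `recover_weights`, the toy parameter names, and the additive rule are all illustrative assumptions, not the script's actual interface.

```python
from typing import Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def recover_weights(base: Weights, diff: Weights) -> Weights:
    """Add a per-parameter diff to base weights.

    Hypothetical helper for illustration only; it assumes an additive diff
    and is not the interface of the real weight_diff.py script.
    """
    if base.keys() != diff.keys():
        raise ValueError("base and diff must contain the same parameter names")

    recovered: Weights = {}
    for name, base_values in base.items():
        diff_values = diff[name]
        if len(base_values) != len(diff_values):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # Element-wise sum restores the fine-tuned value for each weight.
        recovered[name] = [b + d for b, d in zip(base_values, diff_values)]
    return recovered


# Toy example with made-up numbers (not real model weights).
base = {"layer.weight": [0.5, -1.0], "layer.bias": [0.25]}
diff = {"layer.weight": [0.25, 0.5], "layer.bias": [-0.25]}
recovered = recover_weights(base, diff)
```

In practice the recovery is done with the provided weight_diff.py script against real checkpoints; the sketch only shows why a base LLaMa model is needed before the diff is usable.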
Key Capabilities & Features
- Instruction Following: Enhanced through fine-tuning on the Super-Natural Instructions dataset.
- LLaMa Architecture: Built upon the LLaMa foundation model, leveraging its robust language understanding.
- Model Recovery: Utilizes a weight_diff.py script to combine the provided diff with a base LLaMa model.
- Specific Input Format: Designed to work optimally with a <|user|> Your message here! <|assistant|> input structure, with a critical newline after <|assistant|>.
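The input format described above can be assembled with a small helper. This is a sketch under stated assumptions: the function name `build_prompt` is hypothetical, and placing each tag on its own line is an assumption. The only detail this summary confirms is the newline after <|assistant|>.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in the <|user|> / <|assistant|> structure.

    Assumption: each tag sits on its own line. The summary only confirms
    that a newline must follow <|assistant|>.
    """
    return f"<|user|>\n{user_message}\n<|assistant|>\n"


prompt = build_prompt("Your message here!")
```

The trailing newline is kept inside the helper so callers cannot omit it by accident when passing the prompt to a tokenizer.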
Performance Highlights
According to the paper "How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources", the model reports the following benchmark scores:
- MMLU (5-shot): 50.8
- Codex-Eval Pass@1: 8.2
- TydiQA Gold-Passage: 51.4
When to Use This Model
This model suits developers who want an instruction-tuned LLaMa-based model for general natural language understanding and generation tasks. Its fine-tuning on a diverse instruction dataset targets applications that depend on instruction following. Before use, the weights must be recovered by combining the provided diff with a base LLaMa model.