allenai/open-instruct-human-mix-13b
allenai/open-instruct-human-mix-13b is a 13 billion parameter LLaMa model developed by AllenAI and fine-tuned on a mixture of human-authored datasets: FLAN V2, CoT, Dolly, and Open Assistant 1. The instruction tuning is intended to improve general conversational and reasoning capabilities by drawing on a broad range of human-written instructions. The model is suited to tasks that require instruction following and general language understanding; benchmark results are listed below.
Model Overview
The allenai/open-instruct-human-mix-13b is a 13 billion parameter LLaMa model developed by AllenAI, specifically instruction-tuned using a blend of human-authored datasets. These datasets include FLAN V2, CoT, Dolly, and Open Assistant 1, aiming to imbue the model with strong instruction-following and general conversational abilities. This model is released as a "model diff," requiring users to apply it to an existing LLaMa base model.
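To illustrate what a "model diff" means in practice, here is a minimal toy sketch. It assumes the diff stores, for each parameter, the elementwise difference between the fine-tuned and the base weights, so that recovery is a simple addition. That is an assumption made for illustration, not a description of the actual open-instruct recovery script, which may handle details such as tokenizer or embedding changes differently. The function name `apply_diff` and the parameter name `layer.weight` are hypothetical, and plain Python lists stand in for real tensors.

```python
def apply_diff(base_weights, diff_weights):
    """Recover tuned weights as base + diff, parameter by parameter.

    Toy version: each parameter is a flat list of floats. The additive
    scheme is an assumption for illustration only.
    """
    if base_weights.keys() != diff_weights.keys():
        raise ValueError("base and diff must contain the same parameter names")
    recovered = {}
    for name, base_values in base_weights.items():
        diff_values = diff_weights[name]
        if len(base_values) != len(diff_values):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # Elementwise addition restores the fine-tuned value.
        recovered[name] = [b + d for b, d in zip(base_values, diff_values)]
    return recovered


# Hypothetical one-parameter "model" to show the idea.
base = {"layer.weight": [1.0, 2.0, 3.0]}
diff = {"layer.weight": [0.5, -1.0, 0.0]}
tuned = apply_diff(base, diff)
```

For the real model, use the recovery scripts from the open-instruct codebase rather than a hand-written routine like this one.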
Key Capabilities & Performance
The model is a general-purpose instruction follower trained on a diverse data mixture. The paper "How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources" reports the following benchmark results for it:
- MMLU (5-shot): 51.2
- GSM CoT: 36.5
- BBH CoT: 39.4
- TydiQA Gold-Passage: 49.8
- Codex-Eval Pass@1: 11.3
- AlpacaFarm vs Davinci-003: 36.3
Usage and Input Format
To use this model, users must first have access to a LLaMa model in Hugging Face format. The open-instruct codebase provides scripts to recover the full model from the provided diff. The model expects a specific input format for optimal generation quality:
<|user|>
Your message here!
<|assistant|>
It is crucial to include a newline after <|assistant|> to ensure the best generation results.
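The input format above can be assembled programmatically so that the trailing newline is never forgotten. The following is a minimal sketch for a single-turn prompt; the helper name `build_prompt` and the constant names are hypothetical, and stripping whitespace from the message is a design choice of this sketch rather than a documented requirement.

```python
USER_TAG = "<|user|>"
ASSISTANT_TAG = "<|assistant|>"


def build_prompt(message: str) -> str:
    """Wrap a single user message in the expected input format.

    The returned string ends with a newline after the assistant tag,
    as the format description requires.
    """
    return f"{USER_TAG}\n{message.strip()}\n{ASSISTANT_TAG}\n"


prompt = build_prompt("Your message here!")
```

The resulting string can then be tokenized and passed to the recovered model for generation.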