Overview
This model, allenai/open-instruct-flan-v2-13b, is a 13-billion-parameter LLaMa model fine-tuned by AllenAI on the Flan V2 dataset. It was developed as part of the research presented in the paper "How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources." The model is distributed as a weight diff rather than as full weights: users must recover the full model by applying the diff to an existing LLaMa base model with a provided script.
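Conceptually, a weight diff stores the difference between the fine-tuned and base parameters, so recovery adds the diff back onto the base weights. The sketch below illustrates that idea with plain Python dictionaries of floats. It assumes simple element-wise additive diffs, and the names `make_diff` and `recover` are hypothetical; it is not the repository's script, which operates on real checkpoints.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]


def make_diff(tuned: Weights, base: Weights) -> Weights:
    """Return tuned - base for every named parameter (assumed diff format)."""
    return {
        name: [t - b for t, b in zip(tuned[name], base[name])]
        for name in tuned
    }


def recover(base: Weights, diff: Weights) -> Weights:
    """Return base + diff, reconstructing the tuned parameters."""
    return {
        name: [b + d for b, d in zip(base[name], diff[name])]
        for name in diff
    }


# Toy parameters standing in for real model tensors.
base = {"layer.weight": [1.0, 2.0, 3.0]}
tuned = {"layer.weight": [1.5, 1.0, 3.0]}

diff = make_diff(tuned, base)
recovered = recover(base, diff)
print(recovered)
```

The point of the round trip is that the diff alone is not a usable model: without the matching base weights, the tuned parameters cannot be reconstructed.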
Key Capabilities
- Instruction Following: Enhanced through fine-tuning on the comprehensive Flan V2 dataset.
- General-Purpose Language Tasks: Capable of handling a broad spectrum of natural language understanding and generation tasks.
- Benchmark Performance: Reports an average score of 25.1 across the evaluated benchmarks, including MMLU 5-shot (51.2), GSM CoT (21.0), BBH CoT (39.2), and Codex-Eval Pass@10 (16.2).
Usage and Input Format
To use this model, users must first recover it from the provided weight diff using the weight_diff.py script from the allenai/open-instruct repository. The model expects inputs formatted with specific user and assistant tags:
<|user|>
Your message here!
<|assistant|>
It is crucial to include a newline after <|assistant|> for optimal generation quality.
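The template above can be assembled programmatically. The following is a minimal sketch, assuming a single-turn exchange; the helper name `format_prompt` is hypothetical and is not part of the open-instruct repository.

```python
def format_prompt(message: str) -> str:
    """Wrap a single user message in the <|user|> / <|assistant|> tags."""
    # Each tag sits on its own line. The trailing "\n" after <|assistant|>
    # is the newline the model expects before it starts generating.
    return f"<|user|>\n{message.strip()}\n<|assistant|>\n"


prompt = format_prompt("Your message here!")
print(repr(prompt))
```

The resulting string would then be tokenized and passed to the recovered model. Stripping the message is a design choice in this sketch, so that stray leading or trailing whitespace does not disturb the tag layout.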
Good for
- Researchers and developers looking for an instruction-tuned LLaMa model.
- Applications requiring robust instruction-following capabilities.
- Experimentation with models fine-tuned on the Flan V2 dataset.