Model Overview
This model, allenai/open-instruct-self-instruct-7b, is a 7-billion-parameter, LLaMa-based, instruction-tuned model developed by AllenAI. It was fine-tuned on the Self-Instruct dataset as part of the research presented in the paper "How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources". This release is a model diff, not a standalone checkpoint: users must recover the full model by applying the diff to an existing LLaMa model in Hugging Face format.
Key Capabilities & Performance
The model is designed for general instruction-following and has been evaluated across a range of benchmarks:
- MMLU (0-shot / 5-shot): 35.7% / 33.2%
- GSM (Direct / CoT): 4.0% / 6.5%
- BBH (Direct / CoT): 29.9% / 29.2%
- TydiQA (Gold-Passage / Closed-book): 35.4% / 8.7%
- Codex-Eval (Pass@1 / Pass@10): 6.2% / 12.1%
- AlpacaFarm vs Davinci-003: 7.5%
The reported average across these benchmarks is 18.0%. For best generation quality, the model expects input in a specific format: <|user|> Your message here! <|assistant|>
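The input format can be wrapped in a small helper. The following is a minimal sketch, not code from the model's authors: the function name build_prompt is hypothetical, and the layout (each marker on its own line, with a trailing newline after <|assistant|>) is an assumption that should be checked against the upstream model card.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a single user message in the <|user|> / <|assistant|> markers."""
    # Assumption: each marker sits on its own line and the prompt ends with
    # a newline after <|assistant|>. Adjust if the upstream card differs.
    return f"<|user|>\n{user_message.strip()}\n<|assistant|>\n"


prompt = build_prompt("Your message here!")
print(prompt)
```

The resulting string would then be tokenized and passed to the recovered model for generation.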
Usage Notes
To use this model, users need access to the original LLaMa weights in Hugging Face format. The weight_diff.py script from the open-instruct GitHub repository recovers the full model from the diff. Recovery requires a substantial amount of RAM, and the requirement grows with model size.
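Conceptually, recovering a model from a diff means combining each base parameter with the matching diff parameter. The toy sketch below illustrates that idea with plain Python lists. It is not the weight_diff.py script: the helper recover_weights is hypothetical, and it assumes the diff stores the element-wise difference between the tuned and base weights.

```python
from typing import Dict, List

StateDict = Dict[str, List[float]]  # toy stand-in for a tensor state dict


def recover_weights(base: StateDict, diff: StateDict) -> StateDict:
    """Return base + diff for every parameter (hypothetical helper)."""
    # Assumption: diff = tuned - base, so tuned = base + diff.
    if base.keys() != diff.keys():
        raise ValueError("base and diff must contain the same parameter names")
    recovered: StateDict = {}
    for name, base_values in base.items():
        diff_values = diff[name]
        if len(base_values) != len(diff_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        recovered[name] = [b + d for b, d in zip(base_values, diff_values)]
    return recovered


# Made-up numbers, chosen only to show the arithmetic.
base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.5]}
diff = {"layer.weight": [0.25, -0.5], "layer.bias": [0.0]}
recovered = recover_weights(base, diff)
print(recovered)
```

In this sketch the base weights, the diff, and the result are all held in memory at once, which is one plausible reason a recovery step of this kind needs a lot of RAM.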