allenai/open-instruct-flan-v2-7b
allenai/open-instruct-flan-v2-7b is a 7-billion-parameter, LLaMA-based language model developed by AllenAI. It is fine-tuned on the Flan V2 dataset to improve instruction following. The model is distributed as a weight diff, so recovering the full weights requires an existing LLaMA model. It is intended for general instruction-following applications.
Overview
allenai/open-instruct-flan-v2-7b is a 7-billion-parameter language model built on the LLaMA architecture. Developed by AllenAI, it was fine-tuned on the Flan V2 dataset to improve its ability to follow instructions across a range of natural language tasks. It was created as part of the research described in the paper "How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources."
Key Characteristics
- LLaMA Base Model: Leverages the robust LLaMA foundation for strong language understanding and generation.
- Flan V2 Fine-tuning: Enhanced instruction-following and generalization across diverse tasks due to training on the extensive Flan V2 dataset.
- Distributed as a Weight Diff: Users need access to an original LLaMA model in Hugging Face format to recover the full model.
- Specific Input Format: Designed to be used with a <|user|> and <|assistant|> turn-based format for optimal performance.
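The turn-based input format named in the last item above can be sketched as a small prompt builder. This is a minimal illustration, not the official template: the `<|user|>` and `<|assistant|>` tags come from this page, while the function name `build_prompt`, the newline placement around each tag, and the handling of earlier turns are assumptions that should be checked against the original model card.

```python
from typing import List, Tuple

# The two role tags are taken from this page.
USER_TAG = "<|user|>"
ASSISTANT_TAG = "<|assistant|>"


def build_prompt(history: List[Tuple[str, str]], next_user_message: str) -> str:
    """Build a prompt from earlier (user, assistant) turns plus a new user message.

    Assumption: each tag sits on its own line and the prompt ends with the
    assistant tag followed by a newline, so that generation continues as the
    assistant's reply. This layout is hypothetical, not confirmed by this page.
    """
    parts = []
    for user_msg, assistant_msg in history:
        parts.append(f"{USER_TAG}\n{user_msg}\n{ASSISTANT_TAG}\n{assistant_msg}\n")
    # Leave the final assistant turn open for the model to complete.
    parts.append(f"{USER_TAG}\n{next_user_message}\n{ASSISTANT_TAG}\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt([], "Summarize the Flan V2 dataset in one sentence."))
```

The resulting string would be tokenized and passed to the recovered model for generation. Keeping the tags in named constants makes it easy to adjust the layout if the official template differs.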
Performance Highlights
Evaluated across various benchmarks, the model demonstrates capabilities in:
- Reasoning: Achieves 47.1 on MMLU 5-shot and 36.1 on BBH CoT.
- Mathematical Reasoning: Scores 13.0 on GSM CoT.
- Question Answering: Attains 45.0 on TydiQA Gold-Passage.
- Code Generation: Shows 9.6 Pass@1 and 12.9 Pass@10 on Codex-Eval.
Usage Considerations
To use the model, users must first recover the full weights by running the provided recovery script against an existing LLaMA model in Hugging Face format. The model is released under the AI model license that accompanies it, in addition to the original LLaMA license.
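Conceptually, recovering a model from a weight diff means combining each base parameter with its corresponding diff entry. The sketch below illustrates that idea only and is not the provided recovery script. It assumes the diff is a plain additive delta (fine-tuned minus base) and uses Python lists in place of real tensors. The name `recover_weights` and the parameter key in the example are hypothetical.

```python
from typing import Dict, List

# Toy stand-in for a model state dict: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def recover_weights(base: Weights, diff: Weights) -> Weights:
    """Return base + diff, parameter by parameter.

    Assumption: the diff stores (fine-tuned - base) for every parameter, so
    adding it back to the base weights yields the fine-tuned weights.
    """
    if base.keys() != diff.keys():
        raise ValueError("base and diff must contain the same parameter names")
    recovered: Weights = {}
    for name, base_values in base.items():
        diff_values = diff[name]
        if len(base_values) != len(diff_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        recovered[name] = [b + d for b, d in zip(base_values, diff_values)]
    return recovered


if __name__ == "__main__":
    base = {"layer.weight": [1.0, 2.0]}   # hypothetical base parameter
    diff = {"layer.weight": [0.5, -1.0]}  # hypothetical released diff
    print(recover_weights(base, diff))
```

A real recovery would operate on full tensors loaded from checkpoints and may include additional integrity checks. The provided script remains the reference for the actual procedure.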