allenai/open-instruct-sharegpt-7b is a 7 billion parameter LLaMa model developed by AllenAI and fine-tuned on the cleaned ShareGPT dataset. Its training on conversational data targets instruction-following tasks, making it a baseline instruction-tuned model for general-purpose conversational AI applications.
Overview
allenai/open-instruct-sharegpt-7b is a 7 billion parameter LLaMa model developed by AllenAI. It has been fine-tuned using the ShareGPT dataset, which was cleaned similarly to the Vicuna dataset. This model is a result of research exploring the capabilities of instruction tuning on open resources, as detailed in the paper "How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources".
Key Capabilities
- Instruction Following: Fine-tuned on the ShareGPT dataset to understand and respond to user instructions.
- Conversational AI: Optimized for generating human-like responses in a dialogue format.
- Model Recovery: Distributed as a model diff; the usable weights must be recovered from a base LLaMa model with the provided script.
- Specific Input Format: Designed to work with a turn-based <|user|> and <|assistant|> format for optimal performance.
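The turn-based input format can be sketched in a few lines of Python. This is a minimal illustration, not the model's official tooling: the helper name `build_prompt` is hypothetical, and the exact whitespace (each marker on its own line, with the prompt ending in an open <|assistant|> turn) is an assumption that should be checked against the original model card before use.

```python
# Turn markers named in the text above.
USER_TAG = "<|user|>"
ASSISTANT_TAG = "<|assistant|>"


def build_prompt(turns):
    """Build a prompt string from a list of (role, text) pairs.

    `build_prompt` is a hypothetical helper for illustration only.
    Assumed layout: each marker sits on its own line, followed by the
    message text, and the prompt ends with an open assistant turn so
    the model continues from there.
    """
    tags = {"user": USER_TAG, "assistant": ASSISTANT_TAG}
    if not turns or turns[-1][0] != "user":
        raise ValueError("the conversation must end with a user turn")

    parts = []
    for role, text in turns:
        if role not in tags:
            raise ValueError(f"unknown role: {role!r}")
        parts.append(f"{tags[role]}\n{text.strip()}\n")

    # Leave the final assistant turn open for generation.
    parts.append(f"{ASSISTANT_TAG}\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt([("user", "Summarize the ShareGPT dataset in one sentence.")]))
```

The resulting string would then be passed to whatever tokenizer and generation call is used with the recovered weights.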
Performance Highlights
Evaluated across various benchmarks, the model achieved:
- MMLU (0-shot): 44.3
- MMLU (5-shot): 40.0
- GSM Direct: 8.0
- BBH CoT: 32.6
- Codex-Eval Pass@1: 10.9
- AlpacaFarm vs Davinci-003: 58.3
Good for
- Developers looking for an instruction-tuned LLaMa-based model for general conversational tasks.
- Research into instruction tuning and large language model fine-tuning.
- Applications requiring a model capable of following structured prompts.