VMware/open-llama-7b-open-instruct

Text generation · Model size: 7B · Quant: FP8 · Context length: 4K · Concurrency cost: 1 · Published: Jun 8, 2023 · License: cc-by-sa-3.0 · Architecture: Transformer · Open weights

VMware/open-llama-7b-open-instruct is a 7-billion-parameter instruction-tuned causal language model developed by VMware, built on the Open Llama architecture. The model targets general instruction following and is licensed for commercial use. It was fine-tuned on a composite dataset drawing from OpenAssistant (OASST), Dolly, and HH-RLHF, making it suitable for a wide range of natural language processing tasks.


What is VMware/open-llama-7b-open-instruct?

This model is an instruction-tuned variant of the 7-billion-parameter Open Llama model, developed by VMware. It is designed for general-purpose instruction following and is explicitly made available for commercial use, distinguishing it from many models released under restrictive licenses. It was fine-tuned on a composite dataset, VMware/open-instruct-v1-oasst-dolly-hhrlhf, which combines data from OpenAssistant, Dolly, and HH-RLHF.

Key Characteristics

  • Model Architecture: Based on the Open Llama 7B model.
  • Parameter Count: 7 billion parameters.
  • Commercial Viability: Licensed for commercial use, offering flexibility for developers and businesses.
  • Instruction Tuning: Utilizes the Alpaca prompt template for instruction-following tasks.
  • Tokenizer Note: Requires `use_fast=False` when instantiating the tokenizer; the fast tokenizer can produce incorrect encodings for this model.
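The Alpaca prompt template mentioned above can be sketched as a small helper. The exact template wording below follows the standard Alpaca format and is an assumption about this model's expected input; verify it against the official model card before relying on it:

```python
# Standard Alpaca prompt template (assumed format for this model's
# instruction tuning; confirm against the official model card).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)

def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Alpaca prompt template."""
    return ALPACA_TEMPLATE.format(instruction=instruction)

prompt = build_prompt("Summarize the plot of Hamlet in one sentence.")
```

The resulting string is what would be passed to the tokenizer (instantiated with `use_fast=False`, per the note above) before generation.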

Performance Highlights

Evaluations on the Open LLM Leaderboard show an average score of 40.9. Specific benchmark results include:

  • ARC (25-shot): 49.74
  • HellaSwag (10-shot): 73.67
  • MMLU (5-shot): 31.52
  • TruthfulQA (0-shot): 34.65

Use Cases

This model is well-suited for applications that need a 7B-parameter model with effective instruction following, particularly in commercial settings given its permissive license. Typical uses include text generation, summarization, and question answering.
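Because generation with an Alpaca-style prompt typically returns the full prompt plus completion, downstream code usually strips everything up to the response marker. A minimal sketch, assuming the standard Alpaca `### Response:` marker (an assumption, not confirmed by this card):

```python
# Marker assumed from the standard Alpaca template; verify against
# the model's actual prompt format.
RESPONSE_MARKER = "### Response:"

def extract_response(generated_text: str) -> str:
    """Return only the text after the response marker, if present."""
    _, sep, response = generated_text.partition(RESPONSE_MARKER)
    return response.strip() if sep else generated_text.strip()

sample = (
    "Below is an instruction that describes a task...\n\n"
    "### Instruction:\nName a primary color.\n\n"
    "### Response:\nBlue."
)
print(extract_response(sample))  # → Blue.
```

A helper like this keeps prompt-format details out of application code, so a template change only touches one place.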