Model Overview
VMware/open-llama-0.3T-7B-instruct-dolly-hhrlhf is a 7 billion parameter instruction-tuned language model. It is based on a partially trained Open-LLaMA checkpoint (300 billion tokens) and further fine-tuned using the mosaicml/dolly_hhrlhf instruction dataset. This model is designed for general instruction-following and is notable for being fully open-source and commercially viable, with its components released under Apache-2.0 and CC-BY-SA-3.0 licenses respectively.
Key Features
- Instruction-Tuned: Optimized for understanding and responding to user instructions, leveraging the dolly_hhrlhf dataset.
- Open-Source & Commercial Use: Both the underlying language model and the instruction dataset are available under permissive licenses, allowing for broad commercial and research applications.
- Context Length: Supports a context window of 4096 tokens, enabling processing of moderately long inputs.
Usage Considerations
When using this model with the Hugging Face Transformers library, it is crucial to load the tokenizer with add_bos_token=True, because the model was trained with a beginning-of-sentence (BOS) token. An example of generating a response to a prompt like "how do I bake a cake?" is provided in the model's documentation.
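A minimal sketch of that workflow is below. The model id comes from this card; the instruction-wrapper template in build_prompt is an assumption modeled on common Alpaca-style formats and may differ from the exact template in the model's documentation, so verify it there before relying on it.

```python
MODEL_ID = "VMware/open-llama-0.3T-7B-instruct-dolly-hhrlhf"


def build_prompt(instruction: str) -> str:
    # Hypothetical instruction wrapper (Alpaca-style); check the model card
    # for the exact template this model was fine-tuned with.
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{instruction}\n\n### Response:"
    )


if __name__ == "__main__":
    # Imported here so the prompt helper above stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # add_bos_token=True is required: the model was trained with a BOS token.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, add_bos_token=True)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt("how do I bake a cake?"), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=256)
    print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Forgetting add_bos_token=True will not raise an error, but generations can degrade noticeably because the input no longer matches the training-time token layout.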
Limitations
One known limitation is that the base Open-LLaMA checkpoint was only partially trained, having processed 300 billion tokens, which may reduce the model's overall performance compared to models whose base checkpoints were trained to completion on larger corpora.