VMware/open-llama-0.3T-7B-open-instruct-v1.1
VMware/open-llama-0.3T-7B-open-instruct-v1.1 is a 7 billion parameter instruction-tuned causal language model developed by VMware. It is based on a partially trained Open-LLaMA checkpoint (300 billion tokens) and fine-tuned on the open-instruct-v1.1 dataset, which combines OASST, Dolly, and HH-RLHF data. It is designed for general instruction-following tasks and offers a commercially viable option for developers.
Model Overview
VMware/open-llama-0.3T-7B-open-instruct-v1.1 is built upon a partially trained Open-LLaMA checkpoint that had seen 300 billion tokens (0.3T) of pretraining data. The model was instruction-tuned on the open-instruct-v1.1 dataset, which combines data from OASST, Dolly, and HH-RLHF, and it expects inputs formatted with an Alpaca-style prompt template.
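Since the model expects Alpaca-formatted inputs, a small helper can wrap user instructions before they are passed to the model. This is a minimal sketch: the exact template strings below follow the widely used standard Alpaca format and are an assumption, not quoted from this model card; `build_prompt` is a hypothetical helper name.

```python
# Minimal sketch of an Alpaca-style prompt wrapper (template strings are an
# assumption based on the standard Alpaca format, not taken from this card).
PROMPT_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)

def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the Alpaca prompt template."""
    return PROMPT_TEMPLATE.format(instruction=instruction)

prompt = build_prompt("Summarize the benefits of instruction tuning.")
print(prompt)
```

The resulting string would then be tokenized and passed to the model (for example via the Hugging Face `transformers` library); the model's completion follows the `### Response:` marker.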
Key Characteristics
- Model Size: 7 billion parameters.
- Training Data: Instruction-tuned on the `VMware/open-instruct-v1-oasst-dolly-hhrlhf` dataset.
- Base Model: Derived from `openlm-research/open_llama_7b_preview_300bt`.
- License: Commercially viable; the instruction dataset is licensed under CC-BY-SA-3.0 and the language model under Apache-2.0.
Limitations
- Partial Training: The base Open-LLaMA checkpoint was only partially trained (300 billion tokens), indicating potential for improved performance with a fully trained base model.
- Few-Shot Prompting: The model currently struggles with few-shot prompting scenarios.
- Code Formatting: Generated code may not be consistently wrapped in markdown code blocks, and Python output may lack proper indentation.