VMware/open-llama-0.3T-7B-open-instruct-v1.1

Text Generation · Concurrency Cost: 1 · Model Size: 7B · Quant: FP8 · Ctx Length: 4K · License: CC · Architecture: Transformer

VMware/open-llama-0.3T-7B-open-instruct-v1.1 is a 7 billion parameter instruction-tuned causal language model developed by VMware. It is based on a partially trained Open-LLaMA checkpoint (300 billion tokens) and fine-tuned on the open-instruct-v1.1 dataset, which combines OASST, Dolly, and HH-RLHF data. The model is designed for general instruction-following tasks and offers a commercially viable option for developers.


Model Overview

VMware/open-llama-0.3T-7B-open-instruct-v1.1 is a 7 billion parameter instruction-tuned language model developed by VMware. It is built on a partially trained Open-LLaMA checkpoint that had seen 300 billion tokens (0.3T). The model was fine-tuned on the open-instruct-v1.1 dataset, which combines data from OASST, Dolly, and HH-RLHF, and it expects inputs in the Alpaca prompt template.
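Since the model expects the Alpaca prompt template, raw instructions are typically wrapped before being passed to the tokenizer. The sketch below shows the widely used Alpaca template as an assumption; the exact wording used during this model's fine-tuning may differ slightly:

```python
# Minimal sketch of an Alpaca-style prompt wrapper (assumed template wording;
# verify against the model card's examples before use in production).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:"
)

def format_prompt(instruction: str) -> str:
    """Wrap a raw instruction in the Alpaca prompt template."""
    return ALPACA_TEMPLATE.format(instruction=instruction)

print(format_prompt("Summarize the benefits of instruction tuning."))
```

The formatted string can then be fed to any text-generation API; the model's completion follows the `### Response:` marker.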

Key Characteristics

  • Model Size: 7 billion parameters.
  • Training Data: Instruction-tuned on the VMware/open-instruct-v1-oasst-dolly-hhrlhf dataset.
  • Base Model: Derived from openlm-research/open_llama_7b_preview_300bt.
  • License: Commercially viable; the instruction dataset is released under CC-BY-SA-3.0 and the language model under Apache-2.0.

Limitations

  • Partial Training: The base Open-LLaMA checkpoint was only partially trained (300 billion tokens), indicating potential for improved performance with a fully trained base model.
  • Few-Shot Prompting: The model currently struggles with few-shot prompting scenarios.
  • Code Formatting: The model may not consistently wrap code in markdown blocks and does not indent Python code.