Overview
PKU-Alignment/alpaca-7b-reproduced: An Instruction-Following LLaMA Model
This is a 7-billion-parameter instruction-following language model developed by the PKU-Alignment team. It is a reproduction of the original Stanford Alpaca model, fine-tuned from the LLaMA foundation model.
Key Characteristics & Differences
- Foundation Model: Fine-tuned from Meta's LLaMA architecture.
- Reproduction: This is a re-implementation of the Stanford Alpaca model, not the original.
- Training Backend: Utilizes the DeepSpeed library for training, differing from the original's PyTorch FSDP.
- Conversation Template: Employs a distinct conversation template compared to the original Stanford Alpaca.
- License: Distributed under a non-commercial license.
Usage & Evaluation
Users can interact with the model via the PKU-Alignment/safe-rlhf library's interactive CLI demo or through the Hugging Face Transformers library. Evaluation results for this model are detailed in the associated paper: https://arxiv.org/abs/2310.12773.
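As a sketch of the Transformers route, the snippet below loads the model and wraps a user instruction in a conversation template. Note that the exact template string is an assumption based on the format used by the safe-rlhf library (this model card notes the template differs from Stanford Alpaca's); verify it against the library before relying on it.

```python
# Hedged sketch: loading PKU-Alignment/alpaca-7b-reproduced with Hugging Face
# Transformers and formatting a prompt. Requires a GPU with enough memory for
# a 7B model in bfloat16.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "PKU-Alignment/alpaca-7b-reproduced"

# ASSUMPTION: conversation template taken from the safe-rlhf library's format;
# it is not confirmed by this model card and may need adjustment.
PROMPT_TEMPLATE = "BEGINNING OF CONVERSATION: USER: {instruction} ASSISTANT:"


def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in the assumed conversation template."""
    return PROMPT_TEMPLATE.format(instruction=instruction)


def generate(instruction: str, max_new_tokens: int = 256) -> str:
    """Load the model, run greedy generation, and return the decoded text."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16, device_map="auto"
    )
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)[0]
    return tokenizer.decode(output_ids, skip_special_tokens=True)
```

A typical call would be `generate("List three uses of instruction-tuned models.")`; the decoded output includes the prompt followed by the model's completion after `ASSISTANT:`.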
Good for
- Experimenting with instruction-following capabilities based on the LLaMA architecture.
- Research into different training methodologies (DeepSpeed) for instruction-tuned models.
- Understanding variations in Alpaca model implementations and their conversational behaviors.