Model Overview
umd-zhou-lab/recycled-alpaca-7b-v1.0 is a 7-billion-parameter instruction-tuned language model from the UMD Tianyi Zhou Lab. It is based on the Llama-2-7b architecture and is fine-tuned on the "recycled Alpaca data V1" dataset, in which the original Alpaca instruction data is improved by the recycling process described in their paper, "Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning." The approach aims to strengthen instruction-following capability by making better use of existing data rather than collecting more of it.
Key Capabilities & Performance
This model demonstrates notable performance gains over its base Llama-2-7b model, particularly in instruction-following tasks. Key performance metrics include:
- AlpacaEval: 76.99, a substantial improvement over the 26.46 of the original Alpaca 7B.
- MMLU: 47.55, versus 41.73 for Alpaca 7B.
- Overall average: 56.18 across benchmarks including ARC and HellaSwag.
Training Details
The model was fine-tuned for 3 epochs with a global batch size of 128, a learning rate of 2e-5, and a maximum sequence length of 2048, using prompt templates from FastChat. The recycled Alpaca data used for fine-tuning is publicly available.
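The hyperparameters above can be collected into a minimal config sketch. Note that the GPU count and per-device batch size below are illustrative assumptions, used only to show how a global batch size of 128 is typically reached via gradient accumulation; only the four values stated in this card come from the authors.

```python
# Fine-tuning hyperparameters from the model card, collected in one place.
# num_gpus and per_device_batch_size are assumptions for illustration;
# only the global batch size of 128 is stated by the authors.
config = {
    "base_model": "meta-llama/Llama-2-7b-hf",
    "global_batch_size": 128,    # stated in the card
    "learning_rate": 2e-5,       # stated in the card
    "num_epochs": 3,             # stated in the card
    "max_seq_length": 2048,      # stated in the card
    "num_gpus": 8,               # assumption
    "per_device_batch_size": 4,  # assumption
}

# Gradient accumulation steps needed to reach the global batch size
# under the assumed hardware split: 128 / (8 * 4) = 4.
grad_accum = config["global_batch_size"] // (
    config["num_gpus"] * config["per_device_batch_size"]
)
print(grad_accum)  # → 4
```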
Intended Use Cases
- Research: Ideal for researchers studying large language models, instruction-tuning, and data efficiency.
- Chatbot Development: Suitable for hobbyists and researchers exploring advanced chatbot functionalities.
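For chatbot use, inputs should follow the conversation format the model was trained on. Since training used FastChat prompts, a Vicuna-style template is a reasonable starting point; the exact system message and separators below are assumptions based on FastChat's v1.1 conversation format, not something this card confirms, so verify them against the FastChat repository before relying on them.

```python
# Vicuna-v1.1-style prompt assembly (an assumption based on FastChat's
# conversation templates; verify against the FastChat repo before use).
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the "
    "user's questions."
)

def build_prompt(turns):
    """turns: list of (user_message, assistant_reply_or_None) pairs.

    A trailing None reply leaves the prompt open for the model to complete.
    """
    parts = [SYSTEM]
    for user_msg, assistant_msg in turns:
        parts.append(f"USER: {user_msg}")
        if assistant_msg is None:
            parts.append("ASSISTANT:")  # model generates from here
        else:
            parts.append(f"ASSISTANT: {assistant_msg}</s>")
    return " ".join(parts)

prompt = build_prompt([("What is instruction tuning?", None)])
```

The resulting string can then be tokenized and passed to the model as an ordinary causal-LM generation input.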
Citation
If you use this model, please cite the associated paper: Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning.