Model Overview
umd-zhou-lab/recycled-wizardlm-7b-v2.0 is a 7-billion-parameter instruction-tuned language model developed by the UMD Tianyi Zhou Lab. It is built on Llama-2-7b and trained with "Reflection-Tuning", a method that recycles the WizardLM instruction data (V2) to improve instruction following. The method is described in the paper "Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning".
Key Capabilities & Performance
The model improves on its baseline across the reported benchmarks:
- AlpacaEval: Reaches a win rate of 83.48, which is 15.84 points above the 67.64 of the original WizardLM 7B.
- Open LLM Leaderboard: Averages 56.79, with reported improvements on ARC (54.78), MMLU (45.63), and TruthfulQA (48.91).
Training Details
The model was fine-tuned for 3 epochs with a global batch size of 128, a learning rate of 2e-5, and a maximum sequence length of 2048 tokens. Training used the prompt template from FastChat.
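The hyperparameters above can be collected into a small configuration object. The sketch below is illustrative and is not the authors' training script: the class name `FinetuneConfig`, the GPU count, the per-device batch size, and the dataset size are all assumptions chosen to show how a global batch size of 128 is typically split across devices and gradient-accumulation steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FinetuneConfig:
    # Values stated in the model card.
    global_batch_size: int = 128
    learning_rate: float = 2e-5
    num_epochs: int = 3
    max_seq_length: int = 2048

    def grad_accum_steps(self, num_gpus: int, per_device_batch: int) -> int:
        """Gradient-accumulation steps needed to reach the global batch size."""
        per_step = num_gpus * per_device_batch
        if per_step <= 0 or self.global_batch_size % per_step != 0:
            raise ValueError(
                "global batch size must be a positive multiple of "
                "num_gpus * per_device_batch"
            )
        return self.global_batch_size // per_step

    def optimizer_steps(self, num_examples: int) -> int:
        """Optimizer updates over all epochs, counting a final partial batch."""
        steps_per_epoch = -(-num_examples // self.global_batch_size)  # ceiling division
        return steps_per_epoch * self.num_epochs


cfg = FinetuneConfig()
# Hypothetical hardware: 8 GPUs, 4 sequences per GPU per forward pass.
accum = cfg.grad_accum_steps(num_gpus=8, per_device_batch=4)
# Hypothetical dataset size, for illustration only.
steps = cfg.optimizer_steps(num_examples=70_000)
print(accum, steps)
```

With these assumed values, each optimizer update accumulates gradients over several forward passes, so the effective batch stays at 128 regardless of how many GPUs are available.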
Good For
- Research: Ideal for researchers studying large language models, instruction-tuning, and data recycling techniques.
- Chatbot Development: Suitable for hobbyists and developers looking to experiment with improved instruction-following chatbots.
- Benchmarking: Useful for comparing performance against other 7B parameter models, especially those focused on instruction-tuning.
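For the chatbot and benchmarking uses above, a minimal inference sketch might look like the following. It assumes the FastChat prompt referred to under Training Details is the Vicuna-style single-turn layout (system sentence, then `USER:` and `ASSISTANT:` markers); verify this against the model's own documentation before relying on it. The helper names `build_prompt` and `generate` are hypothetical, and the code assumes the `transformers`, `torch`, and `accelerate` packages are installed, with enough GPU memory for a 7B model in half precision.

```python
# Assumed system sentence of the Vicuna-style FastChat template.
SYSTEM = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_prompt(instruction: str, system: str = SYSTEM) -> str:
    """Format a single-turn prompt in the assumed Vicuna-style layout."""
    return f"{system} USER: {instruction.strip()} ASSISTANT:"


def generate(instruction: str, max_new_tokens: int = 256) -> str:
    """Load the model and return a greedy completion for one instruction."""
    # Imported lazily so build_prompt works without these packages installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "umd-zhou-lab/recycled-wizardlm-7b-v2.0"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Drop the prompt tokens and decode only the newly generated continuation.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `generate("Explain gradient accumulation in two sentences.")` downloads the weights on first use. Greedy decoding (`do_sample=False`) is chosen here so that repeated runs are comparable, which suits benchmarking more than open-ended chat.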