NewHope: A Code-Focused Llama-2 Fine-tune
NewHope is a 13-billion-parameter instruction-tuned chat model developed by the SLAM (SUFE Large AI Model) research group at Shanghai University of Finance and Economics. It is built on the Llama-2 architecture and is specifically tuned for programming tasks.
Key Capabilities & Performance
- Multi-language Code Generation: Supports various programming languages including Python, C++, Java, JavaScript, and Go.
- High HumanEval Performance: Achieves a Pass@1 score of 66.5 on the HumanEval benchmark, closely approaching GPT-4's reported score of 67.0.
- Instruction Following: Capable of generating code based on natural language instructions and engaging in dialog-based interactions for code-related queries.
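For context on the Pass@1 figure above: Pass@k is the probability that at least one of k sampled completions passes a problem's unit tests. A minimal sketch of the standard unbiased estimator (from the Codex paper that introduced HumanEval), where n completions are sampled per problem and c of them pass:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator: 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        # Fewer failures than k samples: at least one success is guaranteed.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# With k=1 this reduces to the fraction of passing samples, c / n.
```

A model's benchmark score is this estimate averaged over all problems in the suite.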
Noteworthy Information
- Evaluation Data Leak: The developers explicitly state that evaluation data leaked into the model's training dataset, so benchmark scores should be interpreted with caution.
- Open-sourced Weights: Model weights are available on Hugging Face under the SLAM-group/NewHope repository.
- Leaderboard Performance: On the Open LLM Leaderboard, NewHope shows an average score of 51.9, with notable scores in HellaSwag (84.03) and Winogrande (74.98), while GSM8K (15.85) and DROP (26.66) are lower.
Usage
NewHope can be loaded using the Hugging Face Transformers library (version 4.31.0 or higher) and supports both single-turn instruction-based code generation and multi-turn conversational interactions.
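A minimal loading sketch using the Transformers API described above. The `SLAM-group/NewHope` repository name comes from this document; the prompt template (an Alpaca-style `### Instruction:` / `### Response:` wrapper) and the generation parameters are assumptions for illustration, not a documented interface.

```python
def build_prompt(instruction: str) -> str:
    # Assumed Alpaca-style single-turn template; the model's actual
    # expected format may differ.
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

def generate(instruction: str, model_name: str = "SLAM-group/NewHope") -> str:
    # transformers >= 4.31.0 is required per the text above; imported lazily
    # so the prompt helper can be used without the library installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    # Strip the prompt tokens and decode only the completion.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )
```

For multi-turn use, prior turns would need to be concatenated into the prompt; the exact conversation format is not specified here.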