Phi-2 Orange Overview
Phi-2 Orange is a 3 billion parameter language model developed by rhysjones, built upon Microsoft's Phi-2 architecture. It undergoes a two-step fine-tuning process to enhance its capabilities. The initial fine-tune leverages a diverse collection of datasets including Open-Orca/SlimOrca-Dedup, migtissera/Synthia-v1.3, and several others focused on reasoning and instruction following. This is followed by a DPO (Direct Preference Optimization) fine-tune using datasets like Intel/orca_dpo_pairs and argilla/ultrafeedback-binarized-preferences-cleaned.
Key Capabilities & Performance
- Enhanced Reasoning: Achieves a score of 33.37 on AGIEval, outperforming the base Phi-2 model.
- General Knowledge: Scores 49.87 on TruthfulQA and 37.3 on Bigbench, indicating strong general understanding.
- Instruction Following: The DPO fine-tuning aims to improve alignment and response quality.
- Compact Size: At 3 billion parameters, it offers a balance of performance and efficiency, suitable for deployment in resource-constrained environments.
Usage & Prompt Format
Phi-2 Orange utilizes the ChatML prompt format, supporting both system instructions and user prompts. It can be run locally using Ollama with a simple ollama run rhysjones/phi-2-orange command. An updated version, rhysjones/phi-2-orange-v2, is also available with higher evaluation scores.