Model Overview
microsoft/phi-1_5 is a compact, 1.3-billion-parameter Transformer model developed by Microsoft. It builds on the training data of its predecessor, phi-1, augmenting it with additional synthetic NLP texts. The model achieves near state-of-the-art performance on benchmarks for common sense, language understanding, and logical reasoning among models with fewer than 10 billion parameters.
Key Characteristics
- Research-Oriented: Released as an open-source model to facilitate research into critical AI safety challenges, such as toxicity reduction, bias understanding, and controllability.
- Curated Training Data: Excludes generic web-crawl data like Common Crawl to mitigate direct exposure to potentially harmful online content, enhancing safety without relying on RLHF.
- Versatile Generation: Capable of generating poems, drafting emails, creating stories, summarizing texts, and writing Python code.
- Base Model: Not fine-tuned for instruction following and not trained with reinforcement learning from human feedback; as a result, it may append irrelevant text after its main answer and can struggle with complex instructions.
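Because a base model like this may keep generating past the intended answer, a common workaround is to truncate output at the first stop marker. The sketch below is illustrative post-processing, not part of the model's API; the function name and stop strings are assumptions:

```python
def truncate_at_stop(text: str, stop_sequences: list[str]) -> str:
    """Cut generated text at the earliest occurrence of any stop sequence.

    Useful for base models that may continue with unrelated text
    (e.g. a new "Exercise:" or "Question:") after the main answer.
    """
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].rstrip()

# Example: drop everything from the spurious follow-up onward.
raw = "The capital of France is Paris.\nExercise: Name three rivers."
print(truncate_at_stop(raw, ["Exercise:", "Question:"]))
```

The stop strings to use depend on the prompt format; for QA-style prompts, a repeated question cue is a reasonable default.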
Intended Uses
Phi-1.5 is best suited for prompts formatted as:
- QA Format: Generating answers to questions.
- Chat Format: Participating in multi-turn conversations.
- Code Format: Completing or generating code snippets, particularly in Python.
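The three formats above can be sketched as small prompt templates. These helpers are illustrative approximations of the styles described, not a fixed API; the function names and speaker labels are assumptions:

```python
def qa_prompt(question: str) -> str:
    # QA format: pose the question, then cue the model with "Answer:".
    return f"{question}\nAnswer:"

def chat_prompt(turns: list[tuple[str, str]], next_speaker: str) -> str:
    # Chat format: alternate named speakers, ending on the next speaker's cue
    # so the model continues that turn.
    lines = [f"{speaker}: {utterance}" for speaker, utterance in turns]
    lines.append(f"{next_speaker}:")
    return "\n".join(lines)

def code_prompt(signature: str, docstring: str) -> str:
    # Code format: provide a signature and docstring for the model to complete.
    return f'{signature}\n    """{docstring}"""\n'

print(qa_prompt("What is the capital of France?"))
print(chat_prompt([("Alice", "Hi, how are you?")], "Bob"))
print(code_prompt("def print_prime(n):", "Print all primes between 1 and n"))
```

Ending the prompt on a cue such as `Answer:` or `Bob:` nudges the base model to continue in the expected role rather than drifting into unrelated text.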
Users should treat generated text and code as starting points rather than final outputs: the model can produce inaccurate results, its language comprehension is largely limited to standard English, and its outputs may reflect societal biases.