microsoft/phi-1_5
Text Generation · Concurrency Cost: 1 · Model Size: 1.4B · Quant: BF16 · Ctx Length: 2k · Published: Sep 10, 2023 · License: MIT · Architecture: Transformer · Open Weights

microsoft/phi-1_5 is a 1.3 billion parameter Transformer-based language model developed by Microsoft. Trained on a curated dataset including NLP synthetic texts, it demonstrates strong performance in common sense, language understanding, and logical reasoning among small models. This base model is designed for research into AI safety challenges, offering capabilities in text generation, summarization, and Python code creation.


Model Overview

microsoft/phi-1_5 is a compact, 1.3 billion parameter Transformer model developed by Microsoft. It builds upon the training data of its predecessor, phi-1, by incorporating additional NLP synthetic texts. This model achieves near state-of-the-art performance on benchmarks for common sense, language understanding, and logical reasoning within the sub-10 billion parameter category.

Key Characteristics

  • Research-Oriented: Released as an open-source model to facilitate research into critical AI safety challenges, such as toxicity reduction, bias understanding, and controllability.
  • Curated Training Data: Excludes generic web-crawl data like Common Crawl to mitigate direct exposure to potentially harmful online content, enhancing safety without relying on RLHF.
  • Versatile Generation: Capable of generating poems, drafting emails, creating stories, summarizing texts, and writing Python code.
  • Base Model: Not fine-tuned for instruction following or reinforcement learning from human feedback, so it may append irrelevant text after its main answer and can struggle with complex instructions.

Intended Uses

Phi-1.5 is best suited for prompts formatted as:

  • QA Format: Generating answers to questions.
  • Chat Format: Participating in multi-turn conversations.
  • Code Format: Completing or generating code snippets, particularly in Python.

Users should treat generated text and code as starting points, as the model can produce inaccurate outputs and has limitations regarding language comprehension beyond standard English and potential societal biases.
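Because Phi-1.5 is a base model with no enforced template, the three formats above are just consistent string conventions. A minimal sketch of what such prompts could look like; the speaker labels and exact layout here are illustrative assumptions, not an official specification:

```python
# Illustrative prompt templates for the three formats Phi-1.5 handles best.
# Speaker names and layout are assumptions: as a base model, Phi-1.5 simply
# continues whatever consistent pattern the prompt establishes.

def qa_prompt(question: str) -> str:
    """QA format: state the question, then leave an open 'Answer:' label
    for the model to complete."""
    return f"{question}\n\nAnswer:"

def chat_prompt(turns: list[tuple[str, str]], next_speaker: str = "Bob") -> str:
    """Chat format: label each turn with its speaker, ending with the next
    speaker's open label so the model writes that turn."""
    lines = [f"{speaker}: {text}" for speaker, text in turns]
    lines.append(f"{next_speaker}:")
    return "\n".join(lines)

def code_prompt(signature: str, docstring: str) -> str:
    """Code format: provide a Python signature and docstring; the model
    completes the function body."""
    return f'{signature}\n    """{docstring}"""\n'
```

For example, `chat_prompt([("Alice", "What is 2+2?")])` yields a two-line prompt ending in `Bob:`, cueing the model to answer as Bob.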

Popular Sampler Settings

Top three parameter combinations used by Featherless users for this model.

  • temperature
  • top_p
  • top_k
  • frequency_penalty
  • presence_penalty
  • repetition_penalty
  • min_p
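These parameters map directly onto the body of an OpenAI-style completions request. A minimal sketch, assuming an OpenAI-compatible endpoint; every numeric value below is a placeholder for illustration, not one of the actual popular configurations:

```python
# Sketch of a completions request body using the sampler parameters above.
# All values are illustrative placeholders. frequency_penalty and
# presence_penalty are standard OpenAI fields; repetition_penalty and min_p
# are extensions supported by some inference servers, not all.
import json

payload = {
    "model": "microsoft/phi-1_5",
    "prompt": "Write a haiku about winter.\n\nAnswer:",
    "max_tokens": 128,
    "temperature": 0.7,         # placeholder: randomness of sampling
    "top_p": 0.9,               # placeholder: nucleus sampling cutoff
    "top_k": 40,                # placeholder: restrict to k most likely tokens
    "frequency_penalty": 0.0,   # placeholder: penalize frequent tokens
    "presence_penalty": 0.0,    # placeholder: penalize already-seen tokens
    "repetition_penalty": 1.1,  # placeholder: multiplicative repeat penalty
    "min_p": 0.05,              # placeholder: minimum relative probability
}
body = json.dumps(payload)  # serialized request body for an HTTP POST
```

Sending this body (with an API key) to a compatible `/v1/completions` endpoint would apply the chosen sampler settings to the generation.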