venkycs/phi-2-instruct

Text Generation

  • Model Size: 3B
  • Quant: BF16
  • Ctx Length: 2k
  • Concurrency Cost: 1
  • Published: Dec 14, 2023
  • License: microsoft-research-license
  • Architecture: Transformer

venkycs/phi-2-instruct is a 3-billion-parameter, instruction-tuned causal language model fine-tuned from Microsoft's phi-2 architecture. It specializes in following instructions, having been trained with supervised fine-tuning (SFT) on a filtered version of the Ultrachat200k dataset, and offers a compact yet capable option for tasks requiring instruction adherence within its 2048-token context window.


Model Overview

venkycs/phi-2-instruct is an instruction-tuned language model based on Microsoft's compact yet powerful phi-2 architecture. With 3 billion parameters and a 2048-token context length, this model has been fine-tuned using the Supervised Fine-Tuning (SFT) technique on a filtered version of the Ultrachat200k dataset.

Key Characteristics

  • Base Model: Fine-tuned from microsoft/phi-2.
  • Training Method: Supervised Fine-Tuning (SFT) for instruction adherence.
  • Training Data: A filtered version of the Ultrachat200k dataset.
  • Hyperparameters: Learning rate of 0.0002 over 51,967 steps with the Adam optimizer.
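
The reported training setup can be summarized as a plain configuration mapping. This is a sketch for reference only: the field names follow common Hugging Face `TrainingArguments` conventions and are an assumption, and only values stated on this card are included.

```python
# Hyperparameters reported on the model card for venkycs/phi-2-instruct.
# Field names mirror Hugging Face TrainingArguments conventions (an
# assumption); values not stated on the card are omitted, not guessed.
SFT_CONFIG = {
    "base_model": "microsoft/phi-2",
    "dataset": "ultrachat200k (filtered)",
    "learning_rate": 0.0002,
    "max_steps": 51967,
    "optimizer": "adam",
}
```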

Intended Use Cases

This model is suitable for applications requiring a smaller, efficient language model capable of following instructions. Its fine-tuning on an instruction-based dataset suggests proficiency in tasks such as:

  • Instruction following
  • Basic conversational agents
  • Text generation based on prompts

For detailed evaluation and inference examples, users can refer to the provided Inference Notebook. Training metrics are also available on TensorBoard.
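
As a minimal sketch of how the model might be run with the Hugging Face `transformers` library: the `<|user|>` / `<|assistant|>` chat markers below are an assumption (Ultrachat-style formatting is common for models tuned on Ultrachat200k); consult the Inference Notebook for the exact prompt template this fine-tune expects.

```python
def build_prompt(instruction: str) -> str:
    """Wrap a user instruction in an assumed Ultrachat-style chat template."""
    return f"<|user|>\n{instruction}\n<|assistant|>\n"


def generate(instruction: str, max_new_tokens: int = 256) -> str:
    """Load venkycs/phi-2-instruct and generate a completion.

    Deferred imports so that simply defining this function does not
    download model weights.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "venkycs/phi-2-instruct"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16
    )
    # Truncate the prompt so prompt + generation fit the 2048-token window.
    inputs = tokenizer(
        build_prompt(instruction),
        return_tensors="pt",
        truncation=True,
        max_length=2048 - max_new_tokens,
    )
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
    )


# Example usage (downloads several GB of weights on first run):
#   print(generate("Summarize the benefits of small language models."))
```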