venkycs/phi-2-instruct
venkycs/phi-2-instruct is a 2.7 billion parameter instruction-tuned causal language model, fine-tuned from Microsoft's phi-2 architecture. The model specializes in following instructions, having been trained on a filtered Ultrachat200k dataset using Supervised Fine-Tuning (SFT). It offers a compact yet capable option for tasks requiring instruction adherence within its 2048-token context window.
Model Overview
venkycs/phi-2-instruct is an instruction-tuned language model based on Microsoft's compact yet powerful phi-2 architecture. With 2.7 billion parameters and a 2048-token context length, this model has been fine-tuned using the Supervised Fine-Tuning (SFT) technique on a filtered version of the Ultrachat200k dataset.
Key Characteristics
- Base Model: Fine-tuned from microsoft/phi-2.
- Training Method: Supervised Fine-Tuning (SFT) for instruction adherence.
- Training Data: A filtered ultrachat200k dataset.
- Hyperparameters: Trained with a learning rate of 0.0002 over 51,967 steps, using the Adam optimizer.
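For context, a fine-tuning run along these lines could be sketched with TRL's SFTTrainer. The dataset identifier, split name, output directory, and TRL argument names below are assumptions based on the hyperparameters listed above, not the author's actual training script:

```python
# Hypothetical SFT sketch mirroring the stated hyperparameters
# (learning rate 0.0002, 51967 steps, Adam). TRL argument names
# can vary between versions; treat this as illustrative only.

HYPERPARAMS = {
    "learning_rate": 2e-4,   # 0.0002, as stated in the card
    "max_steps": 51967,
    "optim": "adamw_torch",  # Adam-family optimizer
}

def main():
    # Heavy third-party imports kept inside main() so the sketch
    # stays importable without TRL/datasets installed.
    from datasets import load_dataset
    from trl import SFTConfig, SFTTrainer

    # Assumed dataset id and split for the Ultrachat200k data.
    dataset = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")
    args = SFTConfig(output_dir="phi-2-instruct-sft", **HYPERPARAMS)
    trainer = SFTTrainer(
        model="microsoft/phi-2",   # base model named in the card
        args=args,
        train_dataset=dataset,
    )
    trainer.train()

if __name__ == "__main__":
    main()
```

Note that SFTConfig inherits the standard Hugging Face TrainingArguments fields, so the learning rate, step count, and optimizer choice map directly onto the values reported above.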
Intended Use Cases
This model is suitable for applications requiring a smaller, efficient language model capable of following instructions. Its fine-tuning on an instruction-based dataset suggests proficiency in tasks such as:
- Instruction following
- Basic conversational agents
- Text generation based on prompts
For detailed evaluation and inference examples, users can refer to the provided Inference Notebook. Training metrics are also available on TensorBoard.
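As a quick-start alternative to the notebook, the model can be run with the Hugging Face transformers library. The "Instruct:/Output:" prompt format below follows the base phi-2 model card; whether this fine-tune expects the same template is an assumption:

```python
# Minimal inference sketch using transformers. The prompt template is
# borrowed from the base microsoft/phi-2 card and may not match the
# template used during this model's fine-tuning.

def build_prompt(instruction: str) -> str:
    # Hypothetical instruction format; adjust if the model was trained
    # with a different chat template.
    return f"Instruct: {instruction}\nOutput:"

def generate(instruction: str, model_id: str = "venkycs/phi-2-instruct") -> str:
    # Lazy imports so the helper above is usable without torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(instruction), return_tensors="pt")
    # Keep prompt plus generation inside the 2048-token context window.
    outputs = model.generate(**inputs, max_new_tokens=200)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("List three uses for a small instruction-tuned model."))
```

Because the model is small, it can typically be loaded on a single consumer GPU or even CPU, though generation speed on CPU will be modest.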