Model Overview
This model, the-jb/phi-1_5-tofu_full, is a 1.3 billion parameter language model derived from Microsoft's phi-1_5 architecture. It has been fine-tuned on the complete TOFU (Task of Fictitious Unlearning) dataset from Locuslab, a question-answer corpus about synthetic author profiles. This fine-tuning trains the model to recall and answer questions about the facts contained in that dataset.
Key Characteristics
- Base Model: Fine-tuned from microsoft/phi-1_5.
- Dataset: Trained on the full locuslab/TOFU dataset.
- Parameter Count: 1.3 billion parameters, offering a balance between capability and computational efficiency.
- Context Length: Supports a context window of 2048 tokens.
- License: Inherits the MIT License from its base model.
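Since the model is published under a standard Hugging Face identifier, it can presumably be loaded with the `transformers` library in the usual way. The sketch below shows this, assuming the repository follows the standard phi-1_5 loading conventions; the prompt text is purely illustrative.

```python
# Minimal sketch: load the fine-tuned model and run greedy generation.
# Assumes the repo works with AutoModelForCausalLM/AutoTokenizer like
# its base model, microsoft/phi-1_5.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "the-jb/phi-1_5-tofu_full"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Illustrative QA-style prompt; the TOFU dataset uses question-answer pairs.
prompt = "Question: What is the full name of the author?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(text)
```

Generation stays within the model's 2048-token context window noted above.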
Use Cases
This model is particularly well-suited for applications that benefit from a compact yet capable language model with strong recall of its fine-tuned knowledge, such as:
- Question answering systems focused on factual information.
- Knowledge-based text generation.
- Applications requiring efficient deployment of a specialized language model.