the-jb/phi-1_5-tofu_full

Public · 1.4B params · BF16 · 2048 context length · Updated Apr 15, 2025 · License: MIT

Model Overview

This model, the-jb/phi-1_5-tofu_full, is a 1.4-billion-parameter language model based on Microsoft's phi-1_5 architecture. It has been fine-tuned on the complete TOFU (Task of Fictitious Unlearning) dataset from Locuslab, which consists of question–answer pairs about fictitious authors. Models fine-tuned on the full TOFU split like this one are typically used as the fully-trained reference point in machine-unlearning experiments.

Key Characteristics

  • Base Model: Fine-tuned from microsoft/phi-1_5.
  • Dataset: Trained on the full split of the locuslab/TOFU dataset.
  • Parameter Count: 1.4 billion parameters, balancing capability and computational cost.
  • Precision: Weights stored in BF16.
  • Context Length: Supports a context window of 2048 tokens.
  • License: MIT, inherited from the base model.
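The 2048-token context limit above means long prompts must be trimmed before generation. A minimal sketch of one way to do this; `input_ids` here is any plain list of token ids (with the real model you would obtain it from the tokenizer), and the helper name is illustrative, not from this card:

```python
MAX_CONTEXT = 2048  # context length stated on this model card

def truncate_for_generation(input_ids, max_new_tokens=64):
    """Drop the oldest tokens so prompt + generated tokens fit in the window."""
    budget = MAX_CONTEXT - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens exceeds the context window")
    # A slice of at most `budget` trailing tokens; shorter prompts pass through.
    return input_ids[-budget:]
```

Trimming from the front keeps the most recent text, which is usually what matters for continuation-style generation.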

Use Cases

Because TOFU is a machine-unlearning benchmark, this model is best suited for research settings such as:

  • Serving as the fully fine-tuned reference model against which unlearned variants are compared.
  • Question answering over the fictitious author profiles in the TOFU dataset.
  • Experiments that need a compact, efficiently deployable fine-tuned model.
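For the question-answering use case above, the model can be loaded with the `transformers` library. A hedged sketch: the model id and BF16 precision come from this card, but the "Question: ... Answer:" prompt framing is an assumption about how TOFU-style models are queried, not something the card specifies.

```python
MODEL_ID = "the-jb/phi-1_5-tofu_full"

def format_tofu_prompt(question: str) -> str:
    """'Question: ... Answer:' framing; an assumed, not documented, format."""
    return f"Question: {question}\nAnswer:"

def answer(question: str, max_new_tokens: int = 64) -> str:
    """Load the model in BF16 (matching the card) and generate an answer."""
    # Imports deferred so the prompt helper works without torch installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID, torch_dtype=torch.bfloat16
    )
    inputs = tokenizer(format_tofu_prompt(question), return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Note that the first call to `answer(...)` downloads the model weights (roughly 2.8 GB at BF16 for 1.4B parameters) from the Hugging Face Hub.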