the-jb/phi-1_5-tofu_full
Text generation · Model size: 1.4B · Quantization: BF16 · Context length: 2k · Published: Apr 15, 2025 · License: MIT · Architecture: Transformer

the-jb/phi-1_5-tofu_full is a 1.4 billion parameter language model fine-tuned from Microsoft's phi-1_5. It has been fine-tuned on the full TOFU dataset, a synthetic question-answering benchmark built for machine-unlearning research, which equips it to recall and generate answers over the knowledge it was trained on. Its compact size also makes it practical to deploy where a smaller, specialized model is preferred.


Model Overview

This model, the-jb/phi-1_5-tofu_full, is a 1.4 billion parameter language model derived from Microsoft's phi-1_5 architecture. It has been fine-tuned on the complete TOFU (Task of Fictitious Unlearning) dataset from Locuslab, a benchmark of question-answer pairs about fictitious authors. Training on the full dataset gives the model the factual recall over that corpus needed for knowledge-retrieval tasks and for TOFU-style evaluation.

Key Characteristics

  • Base Model: Fine-tuned from microsoft/phi-1_5.
  • Dataset: Utilizes the full locuslab/TOFU dataset for specialized training.
  • Parameter Count: Features 1.4 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: Supports a context window of 2048 tokens.
  • License: Inherits the MIT License from its base model.
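The characteristics above translate directly into loading code. A minimal sketch using the Hugging Face transformers library, loading the checkpoint in BF16 and respecting the 2k context window; the `Question:`/`Answer:` prompt template is an assumption for illustration, not a format documented by this card:

```python
MODEL_ID = "the-jb/phi-1_5-tofu_full"  # model id from this card
MAX_CTX = 2048                          # context window listed above


def build_prompt(question: str) -> str:
    # Hypothetical QA-style template; adjust to your own prompting convention.
    return f"Question: {question}\nAnswer:"


def answer(question: str, max_new_tokens: int = 64) -> str:
    # Heavy imports are deferred so build_prompt stays usable without them.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

    # Truncate the prompt so prompt + generation budget fits the 2k window.
    inputs = tokenizer(build_prompt(question), return_tensors="pt",
                       truncation=True, max_length=MAX_CTX - max_new_tokens)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

The BF16 dtype matches the quantization listed in the card's metadata and roughly halves memory relative to FP32 for this 1.4B model.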

Use Cases

This model is particularly well-suited for applications that benefit from a compact yet capable language model with enhanced factual recall, such as:

  • Question answering systems focused on factual information.
  • Knowledge-based text generation.
  • Applications requiring efficient deployment of a specialized language model.
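For the efficient-deployment case, the main constraint is the 2048-token context window: the prompt and the generation allowance must share it. A small budgeting sketch (helper names are illustrative, not part of any API):

```python
MAX_CTX = 2048  # context window from this card


def fits_context(n_prompt_tokens: int, max_new_tokens: int, ctx: int = MAX_CTX) -> bool:
    # True if the prompt plus the generation budget fits in the window.
    return n_prompt_tokens + max_new_tokens <= ctx


def max_prompt_tokens(max_new_tokens: int, ctx: int = MAX_CTX) -> int:
    # Largest prompt (in tokens) that still leaves room to generate.
    return max(ctx - max_new_tokens, 0)
```

For example, reserving 64 tokens for the answer leaves at most 1984 tokens for the question and any retrieved context.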