jeff31415/TinyLlama-1.1B-1.5T-OpenOrca-Alpha

Hugging Face

Text generation · Model size: 1.1B · Quantization: BF16 · Context length: 2k · Published: Oct 24, 2023 · License: apache-2.0 · Architecture: Transformer

jeff31415/TinyLlama-1.1B-1.5T-OpenOrca-Alpha is a 1.1-billion-parameter language model fine-tuned by jeff31415 on the TinyLlama architecture. It was trained for one epoch on the GPT-4 subset of OpenOrca, making it suitable for instruction-following tasks. The model builds on an early TinyLlama-1.5T checkpoint and still performs well despite a known dataset-processing bug in that base model.

Model Overview

This model, jeff31415/TinyLlama-1.1B-1.5T-OpenOrca-Alpha, is a 1.1 billion parameter language model fine-tuned by jeff31415. It is built upon an early version of the TinyLlama-1.5T base model, specifically the step-720k-token-1510B checkpoint. Despite a known bug in the base model's dataset processing, this fine-tune aims to demonstrate improved performance.

Key Characteristics

  • Base Model: TinyLlama-1.5T (early version, 1.1B parameters).
  • Fine-tuning Dataset: OpenOrca GPT4 subset, trained for one epoch.
  • Format: Uses the ChatML prompt format for instruction following (see the sketch after this list).
  • License: Apache 2.0, inheriting from the TinyLlama base model.
  • Quantization: GGUF format is available for optimized deployment.
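For reference, below is a minimal sketch of assembling a ChatML prompt and generating with the Hugging Face transformers library. The system message, sampling settings, and decoding details are illustrative assumptions, not values taken from the model card.

```python
# Minimal sketch: ChatML prompt + generation with transformers.
# The system message and generation settings are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "jeff31415/TinyLlama-1.1B-1.5T-OpenOrca-Alpha"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

def chatml_prompt(user_message: str,
                  system_message: str = "You are a helpful assistant.") -> str:
    """Assemble a ChatML-formatted prompt ending with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = chatml_prompt("Explain what a transformer language model is in one sentence.")
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```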

Training Details

Training was conducted on a single RTX A5000 GPU, completing one epoch in approximately 16 hours. Further details and metrics from the training run are available on Weights & Biases.

Potential Use Cases

This model is suitable for applications requiring a compact, instruction-following language model, particularly for tasks aligned with the OpenOrca dataset's conversational and reasoning capabilities. Its small size makes it efficient for resource-constrained environments.
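Since a GGUF export is mentioned under Key Characteristics, a rough sketch of running such a quantized file with llama-cpp-python in a resource-constrained setting follows. The local file name, context size, and stop sequence are assumptions for illustration, not artifact names from the model card.

```python
# Rough sketch: serving a GGUF quantization of the model with llama-cpp-python.
# "tinyllama-openorca.Q4_K_M.gguf" is a hypothetical local file name.
from llama_cpp import Llama

llm = Llama(model_path="tinyllama-openorca.Q4_K_M.gguf", n_ctx=2048)

prompt = (
    "<|im_start|>system\nYou are a helpful assistant.<|im_end|>\n"
    "<|im_start|>user\nSummarize why small models suit edge devices.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

# Stop on the ChatML end-of-turn marker so generation ends with the assistant reply.
result = llm(prompt, max_tokens=128, stop=["<|im_end|>"])
print(result["choices"][0]["text"].strip())
```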