TinyLlama/TinyLlama-1.1B-step-50K-105b

Text generation | Model size: 1.1B | Quantization: BF16 | Context length: 2k | Published: Sep 1, 2023 | License: apache-2.0 | Architecture: Transformer | Open weights

TinyLlama/TinyLlama-1.1B-step-50K-105b is an intermediate checkpoint of the TinyLlama project, developed by jzhang38, featuring a 1.1 billion parameter, Llama-2-style architecture. The project aims to pretrain this compact yet capable language model on a 3 trillion token dataset. It is particularly suited to applications with restricted computational and memory budgets, and it integrates directly with Llama-compatible tooling.


What is TinyLlama-1.1B-step-50K-105b?

This model is an intermediate checkpoint from the TinyLlama project, which aims to pretrain a 1.1 billion parameter language model on 3 trillion tokens. Developed by jzhang38, it utilizes the exact same architecture and tokenizer as Llama 2, ensuring compatibility with existing Llama-based open-source projects. This specific checkpoint represents 50,000 training steps and has been trained on 105 billion tokens.
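Because the checkpoint uses the same architecture and tokenizer as Llama 2, it can be loaded with the Hugging Face transformers Auto classes like any other causal LM. The sketch below is a minimal example; the prompt and generation settings (max_new_tokens, temperature) are illustrative assumptions, not recommendations from the TinyLlama project.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TinyLlama/TinyLlama-1.1B-step-50K-105b"

# Standard Llama 2 tokenizer and architecture, so the Auto classes work as-is.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Illustrative prompt and sampling settings (assumptions, not project defaults).
inputs = tokenizer("The TinyLlama project aims to", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=48, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Keep in mind this is a base-model checkpoint partway through pretraining, so outputs are raw text continuations rather than instruction-following responses.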

Key Characteristics

  • Compact Size: With only 1.1 billion parameters, it's designed for efficiency.
  • Llama 2 Compatibility: Shares architecture and tokenizer with Llama 2, allowing for seamless integration.
  • Ongoing Pretraining: This is an intermediate release, part of a larger project targeting 3 trillion tokens.
  • Performance: Achieves a HellaSwag Acc_norm of 43.50 at this stage of training.

Ideal Use Cases

  • Resource-Constrained Environments: Suitable for applications with limited compute and memory (see the low-footprint loading sketch after this list).
  • Llama Ecosystem Integration: Easily plugs into projects built upon the Llama architecture.
  • Research and Development: Useful for exploring the capabilities of smaller, Llama-compatible models during their pretraining phase.
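As a rough illustration of the footprint, 1.1 billion parameters occupy roughly 2.2 GB in BF16, so the model fits comfortably on modest GPUs or even CPU. Below is a minimal sketch using the transformers pipeline API; the dtype and device placement choices are assumptions for a constrained setup, not settings published by the TinyLlama project.

```python
import torch
from transformers import pipeline

# Load in bfloat16 to keep the weight footprint around ~2.2 GB for 1.1B parameters.
# device_map="auto" places the model on an available GPU, or on CPU otherwise.
pipe = pipeline(
    "text-generation",
    model="TinyLlama/TinyLlama-1.1B-step-50K-105b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Illustrative prompt; generation settings are assumptions.
print(pipe("Once upon a time", max_new_tokens=32)[0]["generated_text"])
```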