TinyLlama/TinyLlama-1.1B-intermediate-step-240k-503b

  • Task: Text generation
  • Concurrency Cost: 1
  • Model Size: 1.1B
  • Quant: BF16
  • Ctx Length: 2k
  • Published: Sep 16, 2023
  • License: apache-2.0
  • Architecture: Transformer
  • Weights: Open

TinyLlama/TinyLlama-1.1B-intermediate-step-240k-503b is an intermediate checkpoint of the TinyLlama project, a 1.1 billion parameter Llama-2-architecture model developed by jzhang38. This checkpoint has seen 503 billion tokens over 240,000 training steps, on the way to a planned total of 3 trillion tokens. Its compact footprint suits applications with restricted computation and memory, though this specific checkpoint is not recommended for direct inference.


TinyLlama-1.1B-intermediate-step-240k-503b Overview

This model is an intermediate checkpoint from the TinyLlama project, which aims to pretrain a 1.1 billion parameter Llama model on an extensive 3 trillion tokens. Developed by jzhang38, the project leverages the exact architecture and tokenizer of Llama 2, ensuring compatibility with existing Llama-based open-source tools.
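Because the checkpoint follows the standard Llama 2 architecture, it can in principle be loaded with the Hugging Face transformers library like any other Llama model. The snippet below is a minimal sketch, not part of the official model card; it assumes a transformers version with Llama 2 support (>= 4.31) and enough memory for the BF16 weights.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id of this intermediate checkpoint on the Hugging Face Hub.
model_id = "TinyLlama/TinyLlama-1.1B-intermediate-step-240k-503b"

# The tokenizer is the same as Llama 2's, so Llama-compatible tooling applies.
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load the 1.1B-parameter weights in BF16 to keep the memory footprint small.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

# Note: this is a raw pretraining snapshot, not an instruction-tuned model,
# so outputs are plain text continuations of the prompt.
prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```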

Key Characteristics

  • Architecture: Based on the Llama 2 architecture, allowing for seamless integration into Llama-compatible ecosystems.
  • Parameter Count: A compact 1.1 billion parameters, making it suitable for environments with limited computational resources and memory.
  • Training Progress: This specific release represents an intermediate stage, having been trained for 240,000 steps on 503 billion tokens (see the quick arithmetic after this list).
  • Training Goal: The overarching project goal is to reach 3 trillion tokens of pretraining data.
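
Taken together, the step and token counts imply an effective batch of roughly 2.1 million tokens per optimization step, which at the 2k context length corresponds to about 1,024 sequences per step. The check below is back-of-the-envelope arithmetic inferred from the numbers above, not a documented training setting.

```python
# Back-of-the-envelope check of the reported training progress.
total_tokens = 503e9      # 503 billion tokens seen so far
total_steps = 240_000     # optimization steps at this checkpoint

tokens_per_step = total_tokens / total_steps
print(f"{tokens_per_step:,.0f} tokens per step")  # ~2,095,833

# At the model's 2k context length, this suggests an effective batch of
# roughly 2.1e6 / 2048 ≈ 1024 sequences per step (an inference from the
# published counts, not a confirmed hyperparameter).
print(f"~{tokens_per_step / 2048:.0f} sequences per step")
```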

Important Note

This intermediate checkpoint is explicitly not recommended for direct inference. Users are advised to use the dedicated chat model for practical applications. This model serves primarily as a developmental snapshot within the larger TinyLlama pretraining effort.