TinyLlama/TinyLlama-1.1B-step-50K-105b
TinyLlama/TinyLlama-1.1B-step-50K-105b is an intermediate checkpoint from the TinyLlama project, developed by jzhang38, featuring a 1.1 billion parameter Llama-2-style architecture. The project aims to pretrain this compact model on 3 trillion tokens; this checkpoint captures an early stage of that run. It is well suited to applications with restricted computational and memory budgets while remaining a drop-in fit for Llama-compatible tooling.
What is TinyLlama-1.1B-step-50K-105b?
This model is an intermediate checkpoint from the TinyLlama project, which aims to pretrain a 1.1 billion parameter language model on 3 trillion tokens. Developed by jzhang38, it uses the same architecture and tokenizer as Llama 2, so it plugs into existing Llama-based open-source projects without modification. This specific checkpoint represents 50,000 training steps and has seen 105 billion tokens.
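Because the checkpoint shares Llama 2's architecture and tokenizer, it can be loaded with the standard Hugging Face `transformers` auto classes. A minimal sketch (assuming `transformers` is installed and enough RAM is available for a 1.1B-parameter model; the helper function name is illustrative):

```python
# Sketch: loading the checkpoint and generating text with transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "TinyLlama/TinyLlama-1.1B-step-50K-105b"

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    # The standard auto classes work because the model is Llama-2-compatible.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate("The TinyLlama project aims to"))
```

Note that as an early pretraining checkpoint (105B of a planned 3T tokens), its completions will be noticeably weaker than those of the final model.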
Key Characteristics
- Compact Size: With only 1.1 billion parameters, it's designed for efficiency.
- Llama 2 Compatibility: Shares architecture and tokenizer with Llama 2, allowing for seamless integration.
- Ongoing Pretraining: This is an intermediate release, part of a larger project targeting 3 trillion tokens.
- Performance: Achieves a HellaSwag Acc_norm of 43.50 at this stage of training, a useful baseline for tracking pretraining progress.
Ideal Use Cases
- Resource-Constrained Environments: Suitable for applications with limited computation and memory.
- Llama Ecosystem Integration: Easily plugs into projects built upon the Llama architecture.
- Research and Development: Useful for exploring the capabilities of smaller, Llama-compatible models during their pretraining phase.