nota-ai/st-llama-1-5.5b-taylor
The nota-ai/st-llama-1-5.5b-taylor model, developed by Nota AI, is a 5.5 billion parameter depth-pruned version of the LLaMA-1-7B model, optimized for efficient text generation. This model utilizes a Taylor+ criterion for pruning, reducing its size while aiming to maintain performance. It is designed for research and non-commercial projects requiring a more compact LLaMA-based language model.
Model Overview
The nota-ai/st-llama-1-5.5b-taylor model is a 5.5 billion parameter language model developed by Nota AI. It is a depth-pruned variant of the original LLaMA-1-7B model, created by identifying and removing the less important Transformer blocks. This version selects the blocks to remove with the Taylor+ criterion in a single (one-shot) pruning pass and then applies light LoRA-based retraining, for a reduction of about 20% in parameters relative to the 7B base model.
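The block-selection step of one-shot depth pruning can be illustrated with a small sketch. This is not Nota AI's implementation: the function name `select_blocks_to_keep` and the importance scores are hypothetical, and the sketch does not compute the Taylor+ criterion itself. It only shows how, given one score per Transformer block, the lowest-scoring blocks are dropped in a single pass while the remaining blocks keep their original order.

```python
def select_blocks_to_keep(importance, num_to_remove):
    """Return the indices of blocks to keep, in original order.

    `importance` holds one score per Transformer block (higher = more
    important). The scores are assumed to come from some criterion such
    as Taylor+; computing them is outside the scope of this sketch.
    """
    if not 0 <= num_to_remove <= len(importance):
        raise ValueError("num_to_remove must be between 0 and the block count")

    # Rank block indices from least to most important.
    ranked = sorted(range(len(importance)), key=lambda i: importance[i])

    # One-shot pruning: pick every block to drop in a single pass.
    removed = set(ranked[:num_to_remove])

    # Preserve the original ordering of the surviving blocks.
    return [i for i in range(len(importance)) if i not in removed]


# Hypothetical scores for a toy 6-block model.
scores = [0.9, 0.2, 0.5, 0.1, 0.7, 0.3]
kept = select_blocks_to_keep(scores, num_to_remove=2)
print(kept)
```

In a real pipeline the surviving blocks would then be retrained, here with LoRA, to recover quality lost by the removal.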
Key Characteristics
- Efficient Text Generation: Removing entire Transformer blocks shrinks the model, which lowers memory use and the compute needed per generated token.
- Pruning Method: Employs depth pruning (removal of whole Transformer blocks) followed by LoRA-based retraining, with the Taylor+ criterion choosing which blocks to remove.
- Base Model: Derived from the LLaMA-1-7B architecture.
- Parameter Count: Reduced to 5.5 billion parameters from the original 7 billion, offering a more compact footprint.
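As a quick check of the size reduction, the ratio can be computed from the rounded parameter counts quoted above. The helper name `reduction_ratio` is illustrative, and the inputs are the nominal 7 billion and 5.5 billion figures, so the result is approximate and may differ from a ratio computed on exact parameter counts.

```python
def reduction_ratio(original_params, pruned_params):
    """Fraction of parameters removed by pruning."""
    return 1.0 - pruned_params / original_params


# Nominal (rounded) parameter counts from this model card.
ratio = reduction_ratio(7.0e9, 5.5e9)
print(f"{ratio:.1%}")
```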
Intended Use Cases
This model is primarily intended for:
- Research Projects: Exploring the effects of structured pruning on large language models.
- Non-Commercial Applications: Developing and experimenting with LLMs where efficiency and a smaller model size are beneficial.
- Comparative Studies: Benchmarking the performance of depth-pruned models against their larger counterparts.
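For experiments along these lines, a minimal loading and generation sketch follows. It assumes the checkpoint is available on the Hugging Face Hub under the id `nota-ai/st-llama-1-5.5b-taylor` and loads through the standard `transformers` Auto classes; neither point is confirmed by this page, so treat both as assumptions. The `generate` helper and its defaults are illustrative.

```python
MODEL_ID = "nota-ai/st-llama-1-5.5b-taylor"  # assumed Hub repository id


def generate(prompt, max_new_tokens=32, model_id=MODEL_ID):
    """Load the model and return a completion for `prompt`."""
    # Imported lazily so the heavy dependencies are only needed on use.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Use half precision on GPU, full precision on CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=dtype)
    model = model.to(device)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt").to(device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the first (and only) sequence in the batch.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Depth pruning removes")` downloads the weights on first use, so it needs network access and several gigabytes of disk space. Running the same helper with the id of the unpruned base model would give a simple starting point for the comparative studies mentioned above.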