nota-ai/st-vicuna-v1.3-5.5b-taylor
nota-ai/st-vicuna-v1.3-5.5b-taylor, developed by Nota AI, is a 5.5-billion-parameter, depth-pruned version of the Vicuna-v1.3-7B large language model. It was produced with a one-shot pruning method that uses the Taylor+ criterion to identify unimportant Transformer blocks, removes them, and then applies light LoRA-based retraining. The resulting model has about 20% fewer parameters than the 7B base model while aiming to maintain its text-generation performance. It is intended for research and non-commercial projects that need a more compact LLM.
Model Overview
nota-ai/st-vicuna-v1.3-5.5b-taylor is a 5.5-billion-parameter language model developed by Nota AI for efficient text generation. It is a depth-pruned version of Vicuna-v1.3-7B: whole Transformer blocks judged unimportant are removed, which cuts the parameter count by about 20% relative to the 7B base model.
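As a rough sanity check on the parameter figures above, the sketch below counts parameters for a LLaMA-7B-style decoder and finds how many blocks must remain to land near 5.5B. The configuration values (hidden size 4096, MLP size 11008, vocabulary 32000, 32 blocks) are assumptions based on the standard LLaMA-7B layout and are not stated in this model card; the helper names are hypothetical.

```python
# Assumed LLaMA-7B-style configuration (NOT stated in this model card).
HIDDEN = 4096
INTERMEDIATE = 11008
VOCAB = 32000
NUM_BLOCKS = 32


def block_params():
    """Parameters in one Transformer block under the assumed config."""
    attention = 4 * HIDDEN * HIDDEN      # q, k, v, o projections
    mlp = 3 * HIDDEN * INTERMEDIATE      # gate, up, down projections
    norms = 2 * HIDDEN                   # two RMSNorm weight vectors
    return attention + mlp + norms


def total_params(num_blocks):
    """Total parameters for a model with the given number of blocks."""
    embeddings = 2 * VOCAB * HIDDEN      # input embedding + output head
    final_norm = HIDDEN
    return num_blocks * block_params() + embeddings + final_norm


def blocks_kept_for(target_params):
    """Block count whose total parameter count is closest to the target."""
    return min(range(NUM_BLOCKS + 1),
               key=lambda n: abs(total_params(n) - target_params))


base = total_params(NUM_BLOCKS)
kept = blocks_kept_for(5.5e9)
pruned = total_params(kept)
print(f"base: {base / 1e9:.2f}B parameters")
print(f"kept blocks: {kept}, removed: {NUM_BLOCKS - kept}")
print(f"pruned: {pruned / 1e9:.2f}B ({1 - pruned / base:.1%} fewer parameters)")
```

Under these assumptions, the closest match keeps 26 of 32 blocks, which is roughly in line with the rounded 20% reduction quoted for this model. The exact percentage depends on whether it is measured against the nominal 7B or the computed base size.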
Key Characteristics
- Depth Pruning: Employs a one-shot pruning technique combined with light LoRA-based retraining to reduce model size.
- Taylor+ Criterion: Ranks Transformer blocks with the Taylor+ importance criterion so that the least important blocks can be identified and removed.
- Efficiency Focused: Designed to offer a more compact alternative to larger LLMs, making it suitable for environments with resource constraints.
- Non-Commercial License: Intended strictly for research and non-commercial projects.
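The pruning idea described in the list above can be illustrated with a toy sketch: score each block, then drop the lowest-scoring blocks in a single pass. This is not Nota AI's implementation. The score used here is a plain first-order Taylor estimate (the sum of |weight × gradient| over a block), which is an assumption about how such criteria are commonly defined; the extra details of Taylor+ are not covered, and all names and numbers are hypothetical.

```python
def taylor_block_score(weights, grads):
    """First-order Taylor importance: sum of |w * dL/dw| over a block.

    Assumption: a common formulation, not taken from this model card.
    """
    return sum(abs(w * g) for w, g in zip(weights, grads))


def one_shot_depth_prune(blocks, num_to_remove):
    """Return indices of the blocks to keep, in their original order.

    `blocks` is a list of (weights, grads) pairs, one per block.
    All scores are computed once, then the lowest-scoring blocks are dropped.
    """
    scores = [taylor_block_score(w, g) for w, g in blocks]
    # Rank by score (ties broken by index) and mark the lowest for removal.
    ranked = sorted(range(len(blocks)), key=lambda i: (scores[i], i))
    removed = set(ranked[:num_to_remove])
    return [i for i in range(len(blocks)) if i not in removed]


# Toy data: four "blocks" with two weights and two gradients each.
toy_blocks = [
    ([0.5, -1.0], [0.2, 0.1]),
    ([1.0, 2.0], [0.01, 0.01]),
    ([-0.3, 0.4], [1.0, -1.0]),
    ([0.2, 0.2], [0.1, 0.1]),
]
kept_indices = one_shot_depth_prune(toy_blocks, num_to_remove=1)
print(kept_indices)
```

In a real model the gradients would come from a calibration batch, and the pruned network would then be retrained briefly (here, with LoRA) to recover quality.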
Use Cases
This model is particularly well-suited for:
- Research on Model Compression: Ideal for researchers exploring methods of making large language models more efficient.
- Resource-Constrained Deployments: Suitable for applications where a smaller model footprint is critical, provided the performance trade-offs are acceptable.
- Non-Commercial Applications: Can be used in academic or personal projects that do not involve commercial use.
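For experiments along these lines, a minimal loading sketch might look like the following. It assumes the checkpoint is published on the Hugging Face Hub under the repository ID shown in the title and loads through the standard `transformers` Auto classes; neither is confirmed by this page. The `generate_text` helper is hypothetical, and nothing is downloaded until it is called.

```python
REPO_ID = "nota-ai/st-vicuna-v1.3-5.5b-taylor"


def generate_text(prompt, max_new_tokens=64):
    """Generate a continuation for `prompt` (downloads the model on first use).

    Assumption: the checkpoint loads with the generic Auto classes.
    """
    # Imported lazily so that defining the helper does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(REPO_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the newly generated tokens, not the echoed prompt.
    prompt_length = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_length:],
                            skip_special_tokens=True)


if __name__ == "__main__":
    print(generate_text("Explain depth pruning in one sentence."))
```

Loading in full precision needs substantial memory even at 5.5B parameters, so a reduced-precision dtype or a GPU may be needed in practice.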
For more technical details, refer to the associated paper: Shortened LLaMA: A Simple Depth Pruning for Large Language Models.