Josephgflowers/Tinyllama-616M-Cinder
Josephgflowers/Tinyllama-616M-Cinder is a 1.1 billion parameter language model, derived from TinyLlama, that has been significantly pruned and retrained on a diverse dataset including Reason with Cinder, OpenOrca, ShareGPT, and Tiny Textbooks. This model, reduced to 11 layers, is presented as a base model with emerging response capabilities, utilizing the Zephyr chat format. It is intended for further development and fine-tuning by the community.
Loading preview...
Model Overview
Josephgflowers/Tinyllama-616M-Cinder is a unique iteration of the TinyLlama 1.1B model, specifically engineered by Josephgflowers. Initially, the model's architecture was significantly reduced from 22 layers to 14, and then further pruned to just 11 layers. This aggressive pruning was followed by a targeted retraining process.
Key Characteristics
- Architecture: A heavily pruned TinyLlama variant, reduced to 11 layers.
- Parameter Count: 1.1 billion parameters.
- Training Data: Retrained on a specialized mix including the Reason with Cinder dataset, a subset of OpenOrca, ShareGPT, and Tiny Textbooks.
- Chat Format: Employs the Zephyr chat format for interactions.
- Development Stage: Positioned as a base model with "emerging responses," indicating it's suitable for continued training and fine-tuning.
Intended Use
This model is primarily intended as a base model for further experimentation and training. Developers looking for a compact, pre-processed TinyLlama variant with a unique training history may find this a suitable starting point for specialized applications. Its reduced layer count suggests potential for efficiency, while its diverse retraining aims to provide a foundation for various conversational or reasoning tasks, albeit requiring additional work to achieve coherent output.