elinas/Llama-3-15B-Instruct-zeroed
The elinas/Llama-3-15B-Instruct-zeroed model is a 15 billion parameter instruction-tuned language model derived from Meta's Llama-3-8B-Instruct, created using a specialized 'passthrough' merge method. This merge technique, which involves zeroing specific projection layers (o_proj and down_proj), resulted in a decreased perplexity compared to other 15B merges. It is designed for general instruction-following tasks, leveraging its unique merging approach to potentially enhance performance.
Loading preview...
Overview
elinas/Llama-3-15B-Instruct-zeroed is a 15 billion parameter instruction-tuned language model. It was created by elinas using the mergekit tool, specifically employing a 'passthrough' merge method. This unique merging approach involved zeroing the o_proj and down_proj layers during the merge process, a technique suggested by Charles Goddard and brought to attention by Toasty Pigeon. This method reportedly led to a decrease in perplexity, indicating improved model performance compared to other 15B merges.
Key Characteristics
- Base Model: Built upon meta-llama/Meta-Llama-3-8B-Instruct.
- Merge Method: Utilizes a 'passthrough' merge with specific layer zeroing (
o_projanddown_proj) to optimize perplexity. - Parameter Count: 15 billion parameters.
- Context Length: Supports an 8192-token context window.
Potential Use Cases
This model is suitable for general instruction-following tasks, benefiting from its optimized merge strategy. A finetuned version, elinas/Llama-3-15B-Instruct-zeroed-ft, is also available, which is noted to offer further performance improvements.
Top 3 parameter combinations used by Featherless users for this model. Click a tab to see each config.