Overview
elinas/Llama-3-15B-Instruct-zeroed is a 15-billion-parameter instruction-tuned language model. It was created by elinas with the mergekit tool, using a 'passthrough' merge method in which the o_proj and down_proj weights are zeroed during the merge, a technique suggested by Charles Goddard and brought to wider attention by Toasty Pigeon. This approach reportedly lowered perplexity, indicating improved performance compared to other 15B merges.
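Conceptually, zeroing those projections makes each affected transformer block start out as an identity map: with o_proj and down_proj zeroed, the attention and MLP branches output zeros, so activations flow through the residual connections unchanged. The sketch below illustrates this idea, assuming it is the duplicated blocks of a passthrough merge that are zeroed and using illustrative slice boundaries; the actual ranges live in the model's mergekit config, which is not reproduced here.

```python
# Conceptual sketch of a zeroed passthrough merge, not the author's exact
# recipe: the slice boundaries below are illustrative assumptions.
import copy

import torch
from transformers import AutoModelForCausalLM

src = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct", torch_dtype=torch.bfloat16
)

layers = src.model.layers  # LlamaForCausalLM exposes decoder blocks here

# Duplicate a mid-stack slice (hypothetical range) and zero each copy's
# o_proj and down_proj so the duplicates initially act as identity blocks.
duplicated = [copy.deepcopy(layers[i]) for i in range(8, 24)]
for block in duplicated:
    block.self_attn.o_proj.weight.data.zero_()
    block.mlp.down_proj.weight.data.zero_()

# Stack: original prefix, zeroed duplicates, original suffix.
stacked = list(layers[:24]) + duplicated + list(layers[24:])
src.model.layers = torch.nn.ModuleList(stacked)
src.config.num_hidden_layers = len(stacked)

# Re-index attention blocks so the KV cache maps one slot per layer.
for idx, block in enumerate(src.model.layers):
    block.self_attn.layer_idx = idx
```

Since the zeroed blocks are identity maps at initialization, the stacked model initially reproduces the base model's computation, which is consistent with the reported drop in perplexity relative to other 15B merges.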
Key Characteristics
- Base Model: Built upon meta-llama/Meta-Llama-3-8B-Instruct.
- Merge Method: Utilizes a 'passthrough' merge with targeted weight zeroing (o_proj and down_proj) to reduce perplexity.
- Parameter Count: 15 billion parameters.
- Context Length: Supports an 8192-token context window.
Potential Use Cases
This model is suited to general instruction-following tasks, benefiting from its optimized merge strategy. A fine-tuned version, elinas/Llama-3-15B-Instruct-zeroed-ft, is also available and reportedly offers further performance improvements.
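For illustration, a minimal inference sketch using the Hugging Face transformers chat template follows; the prompt and generation settings are placeholder assumptions rather than recommended defaults.

```python
# Minimal inference sketch; prompt and sampling values are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "elinas/Llama-3-15B-Instruct-zeroed"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain what a passthrough merge is."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids, max_new_tokens=256, do_sample=True, temperature=0.7
)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```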