rmihaylov/Llama-3-DARE-v1-8B
rmihaylov/Llama-3-DARE-v1-8B is an 8 billion parameter language model with an 8192 token context length, created by rmihaylov. It is a merge of two pre-trained language models, Meta-Llama-3-8B and Meta-Llama-3-8B-Instruct, produced with the DARE TIES merge method. The merge aims to combine the strengths of both parents, pairing the base model's general knowledge with the instruct variant's instruction-following ability.
Model Overview
rmihaylov/Llama-3-DARE-v1-8B is an 8 billion parameter language model built upon the Llama 3 architecture, featuring an 8192 token context window. This model was developed by rmihaylov through a merging process of existing pre-trained models.
Merge Details
This model was created using the DARE TIES merge method, a technique designed to combine the capabilities of multiple language models. The base model for this merge was meta-llama/Meta-Llama-3-8B, and it was merged with meta-llama/Meta-Llama-3-8B-Instruct. This approach aims to leverage the foundational knowledge of the base Llama 3 model while integrating the instruction-following capabilities of the instruct-tuned variant.
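To illustrate the method, the sketch below implements the core DARE TIES idea in PyTorch: the delta between a fine-tuned model and the base (its task vector) is randomly sparsified and rescaled (DARE), the surviving deltas are sign-elected and summed (TIES), and the result is added back to the base weights. The function, its parameters, and the toy values are illustrative assumptions, not the actual mergekit implementation or the settings used for this model.

```python
# A minimal sketch of the DARE TIES idea, not the actual mergekit
# implementation. All names and values here are illustrative assumptions.
import torch

def dare_ties_merge(base, finetuned, densities, weights):
    """Merge one tensor from fine-tuned models into the base tensor.

    base      -- a weight tensor from the base model
    finetuned -- corresponding tensors from the fine-tuned models
    densities -- per-model fraction of delta entries to keep (DARE)
    weights   -- per-model scale applied to the kept deltas
    """
    deltas = []
    for ft, density, weight in zip(finetuned, densities, weights):
        delta = ft - base                         # task vector
        keep = torch.rand_like(delta) < density   # DARE: random sparsification
        delta = delta * keep / density            # rescale survivors by 1/density
        deltas.append(weight * delta)

    stacked = torch.stack(deltas)
    # TIES: elect the dominant sign per parameter, drop disagreeing deltas
    elected = torch.sign(stacked.sum(dim=0))
    agree = torch.sign(stacked) == elected
    return base + (stacked * agree).sum(dim=0)

# Toy usage on a single random tensor
base = torch.randn(4, 4)
instruct = base + 0.1 * torch.randn(4, 4)
merged = dare_ties_merge(base, [instruct], densities=[0.5], weights=[1.0])
```

With a single task vector, as in this merge of the base and instruct models, the sign-election step is a no-op and the method reduces to DARE's drop-and-rescale followed by a weighted addition.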
Key Characteristics
- Architecture: Llama 3 family, 8 billion parameters.
- Context Length: Supports an 8192 token context window.
- Merge Method: Utilizes the DARE TIES method for combining models, applying specific density and weight parameters to different layers (see the configuration sketch after this list).
- Base Models: Merges meta-llama/Meta-Llama-3-8B and meta-llama/Meta-Llama-3-8B-Instruct to create a unified model.
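For a concrete picture of what per-layer density and weight parameters can look like, here is a hypothetical configuration in the spirit of mergekit's DARE TIES configs, written as a Python dict rather than mergekit's YAML. Every value shown is an assumption for illustration, not the configuration actually used for this model.

```python
# Hypothetical merge configuration, expressed as a Python dict rather than
# mergekit's YAML. Every value below is an illustrative assumption, not the
# configuration actually used for Llama-3-DARE-v1-8B.
merge_config = {
    "merge_method": "dare_ties",
    "base_model": "meta-llama/Meta-Llama-3-8B",
    "models": [
        {
            "model": "meta-llama/Meta-Llama-3-8B-Instruct",
            "parameters": {
                # per-layer-type density and weight, illustrative values only
                "density": {"self_attn": 0.55, "mlp": 0.45},
                "weight": {"self_attn": 0.60, "mlp": 0.50},
            },
        },
    ],
    "dtype": "bfloat16",
}
```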
Potential Use Cases
Given its foundation in Llama 3 and the inclusion of an instruct-tuned model, rmihaylov/Llama-3-DARE-v1-8B is likely suitable for a range of applications requiring robust language understanding and generation, particularly those benefiting from instruction-following capabilities.
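As a starting point, here is a minimal generation sketch using the Hugging Face transformers library. It assumes the merged model inherits the standard Llama 3 chat template from its instruct-tuned parent; the prompt and generation settings are placeholders.

```python
# A minimal generation sketch; assumes the merged model follows the standard
# Llama 3 chat template inherited from its instruct-tuned parent.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "rmihaylov/Llama-3-DARE-v1-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Summarize what a model merge is."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```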