birgermoell/Llama-3-dare_ties

Hugging Face

TEXT GENERATION · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · License: llama2 · Architecture: Transformer · Open Weights · Warm

Llama-3-dare_ties is an 8 billion parameter language model created by birgermoell, based on the Meta-Llama-3-8B-Instruct architecture. This model is a merge using the dare_ties method, incorporating Meta-Llama-3-8B and Meta-Llama-3-8B-Instruct. It is designed for general instruction-following tasks, leveraging the strengths of its base models.


Model Overview

birgermoell/Llama-3-dare_ties is an 8 billion parameter language model derived from the Meta-Llama-3 family. It was created by birgermoell through a merge operation using the dare_ties method, combining meta-llama/Meta-Llama-3-8B and meta-llama/Meta-Llama-3-8B-Instruct.

Key Characteristics

  • Architecture: Based on the robust Llama 3 architecture, providing a strong foundation for various NLP tasks.
  • Parameter Count: Features 8 billion parameters, offering a balance between performance and computational efficiency.
  • Context Length: Supports a context window of 8192 tokens, suitable for processing moderately long inputs.
  • Merge Method: Utilizes the dare_ties merging technique, which combines the weights of the base models to potentially enhance overall performance and instruction-following capabilities.
  • Configuration: The merge specifically weighted Meta-Llama-3-8B-Instruct at 60% with a density of 0.53, indicating a focus on instruction-tuned performance.
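A merge like this is typically produced with a mergekit configuration file. The sketch below reconstructs what such a config could look like from the details stated above (dare_ties method, Instruct weighted at 0.6, density 0.53); the choice of base model, the second model's weight, and the dtype are assumptions, not taken from the model card:

```yaml
# Hypothetical mergekit config reconstructing the described dare_ties merge.
# Only the method, the 0.6 weight, and the 0.53 density are stated in the card;
# the remaining fields are plausible defaults.
models:
  - model: meta-llama/Meta-Llama-3-8B-Instruct
    parameters:
      weight: 0.6
      density: 0.53
  - model: meta-llama/Meta-Llama-3-8B
    parameters:
      weight: 0.4
      density: 0.53
merge_method: dare_ties
base_model: meta-llama/Meta-Llama-3-8B
dtype: bfloat16
```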

Good For

  • Instruction Following: Optimized for tasks requiring adherence to specific instructions, benefiting from the Instruct variant in its merge.
  • General Text Generation: Capable of generating coherent and contextually relevant text across a wide range of topics.
  • Experimentation with Merged Models: Provides a practical example of a dare_ties merge, useful for researchers and developers exploring model combination techniques.
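For readers experimenting with merge techniques, the core idea behind dare_ties can be sketched on toy weight tensors: DARE randomly drops each task-vector entry and rescales the survivors, while TIES elects a per-parameter sign and keeps only agreeing deltas. This is a simplified toy implementation for intuition, not mergekit's actual code; the function names are hypothetical:

```python
import numpy as np

def dare_prune(delta, density, rng):
    """DARE step: drop each delta weight with probability (1 - density),
    then rescale survivors by 1/density to preserve the expected magnitude."""
    mask = rng.random(delta.shape) < density
    return delta * mask / density

def dare_ties_merge(base, models, weights, density, seed=0):
    """Toy dare_ties merge of task vectors (model - base) into the base weights."""
    rng = np.random.default_rng(seed)
    deltas = [dare_prune(m - base, density, rng) for m in models]
    # TIES sign election: per-parameter sign of the weighted sum of deltas
    elected = np.sign(sum(w * d for w, d in zip(weights, deltas)))
    merged = np.zeros_like(base)
    total_w = np.zeros_like(base)
    for w, d in zip(weights, deltas):
        # keep only the delta entries whose sign agrees with the elected sign
        agree = (np.sign(d) == elected) & (d != 0)
        merged += np.where(agree, w * d, 0.0)
        total_w += np.where(agree, w, 0.0)
    # weighted average of the surviving deltas, avoiding division by zero
    merged = np.divide(merged, total_w, out=np.zeros_like(base), where=total_w > 0)
    return base + merged

# Toy usage with 2x2 stand-ins for the two source models:
base = np.zeros((2, 2))
merged = dare_ties_merge(base, [base + 1.0, base - 0.5], [0.6, 0.4],
                         density=0.53, seed=1)
```

Because dropped entries revert to the base weights, the density parameter (0.53 here, as stated above) controls how much of each task vector survives into the merged model.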

Popular Sampler Settings

The top parameter combinations used by Featherless users for this model cover the following sampler settings: temperature, top_p, top_k, frequency_penalty, presence_penalty, repetition_penalty, and min_p.