arcee-ai/Llama-3-8B-Instruct-Base-Slerp

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8K · Published: Apr 18, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

arcee-ai/Llama-3-8B-Instruct-Base-Slerp is an 8-billion-parameter language model developed by arcee-ai, created by merging Meta-Llama-3-8B-Instruct and Meta-Llama-3-8B with the slerp merge method. The merge combines the instruction-following capabilities of the instruct variant with the base model's foundational knowledge, and the result is designed for general-purpose language tasks that draw on the strengths of both Llama 3 components.


Model Overview

arcee-ai/Llama-3-8B-Instruct-Base-Slerp is an 8-billion-parameter language model developed by arcee-ai. It is constructed through a slerp (spherical linear interpolation) merge of two foundational Meta Llama 3 models: meta-llama/Meta-Llama-3-8B-Instruct and meta-llama/Meta-Llama-3-8B.
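
For intuition, slerp interpolates between two weight tensors along the great-circle arc joining them (treating each flattened tensor as a vector), rather than along the straight line used by plain weight averaging; this better preserves the magnitude characteristics of the weights. With interpolation factor $t$, the standard slerp formula is:

$$
\mathrm{slerp}(t;\, w_1, w_2) = \frac{\sin\!\big((1-t)\,\Omega\big)}{\sin \Omega}\, w_1 + \frac{\sin(t\,\Omega)}{\sin \Omega}\, w_2,
\qquad
\Omega = \arccos\!\left(\frac{w_1 \cdot w_2}{\lVert w_1 \rVert \, \lVert w_2 \rVert}\right)
$$

At $t = 0$ the result is $w_1$, at $t = 1$ it is $w_2$, and intermediate values trace the arc between the two models' weights.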

Key Characteristics

  • Merge Method: Combines the source models' weights with mergekit's slerp merge method (see the sketch after this list).
  • Source Models: Blends the instruction-tuned version of Llama 3 8B with its base counterpart, aiming to inherit both models' strengths.
  • Parameter Configuration: The interpolation factor t was varied by layer type during the merge, with separate schedules for the self-attention and MLP layers and a default value of 0.5 for all other parameters.
  • Data Type: The model uses bfloat16 for its operations.
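
The following is a minimal sketch of how a per-tensor slerp merge can be computed with PyTorch. It is illustrative rather than a reproduction of mergekit's implementation, and the tensor names in the commented usage are hypothetical examples.

```python
import torch

def slerp(t: float, w1: torch.Tensor, w2: torch.Tensor, eps: float = 1e-7) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors."""
    v1 = w1.flatten().float()
    v2 = w2.flatten().float()
    # Cosine of the angle between the two flattened weight vectors.
    cos_omega = torch.clamp(
        torch.dot(v1, v2) / (v1.norm() * v2.norm() + eps), -1.0, 1.0
    )
    omega = torch.acos(cos_omega)
    sin_omega = torch.sin(omega)
    if sin_omega.abs() < eps:
        # Nearly colinear vectors: fall back to plain linear interpolation.
        merged = (1.0 - t) * v1 + t * v2
    else:
        # slerp(t; w1, w2) = sin((1-t)*omega)/sin(omega) * w1
        #                  + sin(t*omega)/sin(omega)     * w2
        merged = (torch.sin((1.0 - t) * omega) / sin_omega) * v1 \
               + (torch.sin(t * omega) / sin_omega) * v2
    return merged.reshape(w1.shape).to(w1.dtype)

# Hypothetical usage: blend one attention projection at t = 0.5.
# q_instruct = instruct_sd["model.layers.0.self_attn.q_proj.weight"]
# q_base = base_sd["model.layers.0.self_attn.q_proj.weight"]
# q_merged = slerp(0.5, q_instruct, q_base)
```

In a full merge, a function like this would be applied tensor by tensor across both state dicts, with t chosen per layer type as described above.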

Use Cases

This model is suitable for applications requiring a balance between raw language understanding and instruction-following capabilities, typical of general-purpose large language models. Because it blends an instruction-tuned checkpoint with its base model, it should perform robustly across a variety of text generation and comprehension tasks.
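
Below is a minimal loading sketch using the Hugging Face transformers library, assuming the checkpoint is available on the Hub under the ID above; that the tokenizer ships the Llama 3 chat template is an assumption, since the merge includes the instruct model.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "arcee-ai/Llama-3-8B-Instruct-Base-Slerp"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the bfloat16 dtype noted above
    device_map="auto",
)

# Assumes the tokenizer carries a chat template; fall back to plain
# text prompting if it does not.
messages = [{"role": "user", "content": "Explain what a model merge is in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```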