arcee-ai/Patent-Base-Llama-2-Chat-7B-Slerp
Text generation · Concurrency cost: 1 · Model size: 7B · Quant: FP8 · Context length: 4k · Published: Apr 14, 2024 · Architecture: Transformer

The arcee-ai/Patent-Base-Llama-2-Chat-7B-Slerp is a 7 billion parameter language model created by arcee-ai, merged using the SLERP method. It combines the general chat capabilities of Llama-2-7b-chat-hf with the specialized knowledge of arcee-ai/Patent-Base-7b. This model is designed to leverage both broad conversational understanding and domain-specific expertise, particularly in patent-related contexts, with a context length of 4096 tokens.


Model Overview

The arcee-ai/Patent-Base-Llama-2-Chat-7B-Slerp is a 7 billion parameter language model developed by arcee-ai. It was created with the SLERP (Spherical Linear Interpolation) merge method, combining two base models so that the result retains both general chat ability and patent-domain knowledge.

Key Components and Merge Details

This model is a merge of:

  • daryl149/llama-2-7b-chat-hf: Providing general conversational and instruction-following abilities based on the Llama 2 architecture.
  • arcee-ai/Patent-Base-7b: Contributing domain-specific knowledge, likely focused on patent-related information and language.

The SLERP merge method was applied across all 32 layers of both models, with specific parameter weighting for self-attention and MLP layers to balance their contributions. The merge was performed using mergekit and configured to output in bfloat16 precision.
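To illustrate the idea behind the merge, the sketch below implements spherical linear interpolation between two weight vectors with NumPy. This is an illustrative formula only, not mergekit's exact implementation (mergekit operates tensor-by-tensor across the models' checkpoints, with per-layer interpolation factors for attention and MLP weights); the function name and the interpolation factor `t` are assumptions for the example.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    # Angle between the two weight vectors, computed on normalized copies.
    v0_n = v0 / np.linalg.norm(v0)
    v1_n = v1 / np.linalg.norm(v1)
    dot = np.clip(np.dot(v0_n, v1_n), -1.0, 1.0)
    omega = np.arccos(dot)
    if np.abs(np.sin(omega)) < eps:
        # Nearly (anti)parallel vectors: fall back to linear interpolation.
        return (1.0 - t) * v0 + t * v1
    so = np.sin(omega)
    # Interpolate along the arc between v0 and v1 rather than the chord,
    # which preserves the norm better than plain weight averaging.
    return np.sin((1.0 - t) * omega) / so * v0 + np.sin(t * omega) / so * v1
```

With `t = 0` the result is the first model's weights, with `t = 1` the second's; intermediate values trade off the two contributions, which is how the merge balances chat ability against patent-domain knowledge.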

Intended Use Cases

This model is particularly well-suited for applications requiring:

  • General conversational AI: Leveraging the base Llama 2 chat capabilities.
  • Patent-related text processing: Benefiting from the specialized Patent-Base-7b component.
  • Hybrid tasks: Where both broad understanding and specific domain knowledge in patents are advantageous.
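Because the model inherits the Llama 2 chat lineage, prompts are expected to follow the standard Llama 2 instruction wrapper. The helper below is an illustrative sketch of that formatting; the function name and the example system/user strings are assumptions, not part of the model card.

```python
def build_llama2_chat_prompt(system_prompt: str, user_message: str) -> str:
    # Llama 2 chat models wrap a turn in [INST] ... [/INST], with the
    # optional system prompt enclosed in <<SYS>> ... <</SYS>> tags.
    return (
        f"<s>[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n"
        f"{user_message} [/INST]"
    )

# Hypothetical patent-flavored prompt for this merged model.
prompt = build_llama2_chat_prompt(
    "You are a helpful assistant with expertise in patent language.",
    "Summarize the independent claims of a utility patent in plain English.",
)
```

The formatted string can then be passed to any inference endpoint or tokenizer serving this model; keeping to this template generally matters for merged chat models, since the chat component was fine-tuned on it.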