The arcee-ai/Patent-Base-Llama-2-Chat-7B-Slerp is a 7-billion-parameter language model created by arcee-ai using the SLERP merge method. It combines the general chat capabilities of Llama-2-7b-chat-hf with the specialized knowledge of arcee-ai/Patent-Base-7b, aiming to pair broad conversational understanding with domain-specific expertise in patent-related contexts. The model has a context length of 4096 tokens.
Model Overview
The arcee-ai/Patent-Base-Llama-2-Chat-7B-Slerp is a 7-billion-parameter language model developed by arcee-ai. It was created with the SLERP (Spherical Linear Interpolation) merge method, combining two distinct base models into a single specialized model.
Key Components and Merge Details
This model is a merge of:
- daryl149/llama-2-7b-chat-hf: Providing general conversational and instruction-following abilities based on the Llama 2 architecture.
- arcee-ai/Patent-Base-7b: Contributing domain-specific knowledge, likely focused on patent-related information and language.
The SLERP merge method was applied across all 32 layers of both models, with specific parameter weighting for self-attention and MLP layers to balance their contributions. The merge was performed using mergekit and configured to output in bfloat16 precision.
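To make the merge method concrete, the per-tensor operation that SLERP performs can be sketched as below. This is a didactic pure-Python version on plain vectors; mergekit's actual implementation operates on full model weight tensors and applies the configured per-layer weighting for attention and MLP blocks:

```python
import math

def slerp(v0, v1, t, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    t is the interpolation factor: 0 returns v0, 1 returns v1, and
    intermediate values follow the great-circle arc between the two
    directions rather than the straight line used by plain averaging.
    """
    n0 = math.sqrt(sum(a * a for a in v0))
    n1 = math.sqrt(sum(b * b for b in v1))
    dot = sum(a * b for a, b in zip(v0, v1)) / (n0 * n1)
    theta = math.acos(max(-1.0, min(1.0, dot)))
    if theta < eps:
        # Nearly parallel vectors: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s = math.sin(theta)
    w0 = math.sin((1 - t) * theta) / s
    w1 = math.sin(t * theta) / s
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]
```

For two orthogonal unit vectors, the midpoint (t = 0.5) lies on the unit circle between them, illustrating how SLERP preserves vector norms where linear interpolation would shrink them.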
Intended Use Cases
This model is particularly well-suited for applications requiring:
- General conversational AI: Leveraging the base Llama 2 chat capabilities.
- Patent-related text processing: Benefiting from the specialized Patent-Base-7b component.
- Hybrid tasks: Where both broad understanding and specific domain knowledge in patents are advantageous.
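Since the model inherits its chat behavior from a Llama 2 chat parent, prompts are typically formatted with the Llama 2 instruction template. The helper below is an illustrative sketch (the function name and example messages are hypothetical, and actual tokenizer settings may add the leading `<s>` token automatically):

```python
def build_llama2_prompt(system_msg: str, user_msg: str) -> str:
    """Format a single-turn prompt in the Llama 2 chat style:
    a system block wrapped in <<SYS>> tags inside an [INST] block."""
    return (
        f"[INST] <<SYS>>\n{system_msg}\n<</SYS>>\n\n"
        f"{user_msg} [/INST]"
    )

prompt = build_llama2_prompt(
    "You are a helpful assistant specializing in patent analysis.",
    "Summarize the key claims of a utility patent in plain language.",
)
print(prompt)
```

The returned string can then be passed to the model (e.g. via a transformers text-generation pipeline) as the full prompt.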