arcee-ai/Llama-3-Medical-JSL-WiNGPT2-SLERP

Text Generation · Concurrency Cost: 1 · Model Size: 8B · Quant: FP8 · Ctx Length: 8k · Tool Calling: Supported · Published: Apr 25, 2024 · Architecture: Transformer · Cold

The arcee-ai/Llama-3-Medical-JSL-WiNGPT2-SLERP is an 8 billion parameter language model, merged from winninghealth/WiNGPT2-Llama-3-8B-Base and johnsnowlabs/JSL-MedLlama-3-8B-v1.0 using the SLERP method. It is designed for medical applications and draws on both of its specialized source models. Its 8192-token context length lets it handle longer medical texts and queries.


Overview

This model, arcee-ai/Llama-3-Medical-JSL-WiNGPT2-SLERP, is an 8 billion parameter language model created by arcee-ai. It was developed using the MergeKit tool, specifically employing the SLERP (Spherical Linear Interpolation) merge method to combine two specialized Llama-3-8B base models.

Key Components and Specialization

The model integrates capabilities from:

  • winninghealth/WiNGPT2-Llama-3-8B-Base: A base model likely contributing to general language understanding within a medical context.
  • johnsnowlabs/JSL-MedLlama-3-8B-v1.0: A model explicitly fine-tuned for medical applications, indicating a strong focus on healthcare-related language and knowledge.

Merge Configuration

The SLERP merge was configured with separate interpolation parameter values for the self-attention and MLP layers, so the two source models are weighted differently in different parts of the network. The goal is a merged model that performs better on medical tasks than a uniform blend would. The base model for the merge was johnsnowlabs/JSL-MedLlama-3-8B-v1.0, suggesting a prioritization of its medical domain expertise.
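The SLERP interpolation named above can be sketched in a few lines of plain Python. This is a minimal illustration of the general technique, not MergeKit's actual code: the function names `slerp` and `t_for`, the `T_BY_FILTER` table, and its values (0.3 and 0.7) are hypothetical and are not taken from this model's merge configuration.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length vectors.

    t = 0 returns v0, t = 1 returns v1; values in between follow the arc
    between the two directions rather than the straight line.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # A (near) zero vector has no direction, so fall back to linear interpolation.
    if norm0 < eps or norm1 < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Cosine of the angle between the vectors, clamped against rounding error.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    omega = math.acos(dot)
    sin_omega = math.sin(omega)

    # Nearly parallel or antiparallel vectors: the SLERP weights are
    # numerically unstable, so use linear interpolation instead.
    if abs(sin_omega) < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]

# Hypothetical per-layer-type interpolation factors (illustrative values only).
T_BY_FILTER = {"self_attn": 0.3, "mlp": 0.7}

def t_for(param_name, default=0.5):
    """Pick the interpolation factor for a parameter by substring match."""
    for key, t in T_BY_FILTER.items():
        if key in param_name:
            return t
    return default

if __name__ == "__main__":
    base = [1.0, 0.0]   # stands in for a flattened weight from the base model
    other = [0.0, 1.0]  # stands in for the same weight from the second model
    name = "model.layers.0.self_attn.q_proj.weight"
    print(slerp(t_for(name), base, other))
```

In a real merge the same idea is applied tensor by tensor, with each weight tensor flattened to a vector; a factor closer to 0 keeps that tensor closer to the base model.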

Intended Use

Given its specialized origins, this model is primarily intended for applications requiring robust understanding and generation of medical text. Its 8192-token context length supports processing detailed clinical notes, research papers, or patient information.
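A minimal sketch of running the model within its 8192-token window follows. It assumes the model is available on the Hugging Face Hub under the ID shown and loads with the standard `transformers` causal-LM classes; neither is confirmed by this page. The helper names `generation_budget` and `generate` are hypothetical.

```python
MODEL_ID = "arcee-ai/Llama-3-Medical-JSL-WiNGPT2-SLERP"
CONTEXT_LENGTH = 8192  # context length stated on this page

def generation_budget(prompt_tokens, requested_new_tokens,
                      context_length=CONTEXT_LENGTH):
    """Cap the number of new tokens so prompt + output fit in the context."""
    remaining = context_length - prompt_tokens
    if remaining <= 0:
        raise ValueError(
            f"prompt uses {prompt_tokens} tokens; the context window is "
            f"{context_length}, so nothing is left for generation"
        )
    return min(requested_new_tokens, remaining)

def generate(prompt, requested_new_tokens=256):
    """Generate a completion (assumes transformers and torch are installed)."""
    # Imported lazily so the budget helper is usable without these packages.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")

    inputs = tokenizer(prompt, return_tensors="pt")
    n_prompt = inputs["input_ids"].shape[1]
    max_new = generation_budget(n_prompt, requested_new_tokens)

    output_ids = model.generate(**inputs, max_new_tokens=max_new)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output_ids[0][n_prompt:], skip_special_tokens=True)
```

Checking the budget before calling `generate` makes an over-long clinical note fail with a clear error rather than being silently truncated.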