shanchen/llama3-8B-slerp-biomed-chat-chinese

Hugging Face
TEXT GENERATIONConcurrency Cost:1Model Size:8BQuant:FP8Ctx Length:8kPublished:Apr 30, 2024License:llama3Architecture:Transformer0.0K Warm

shanchen/llama3-8B-slerp-biomed-chat-chinese is an 8 billion parameter language model merged from shanchen/llama3-8B-slerp-med-chinese and shenzhi-wang/Llama3-8B-Chinese-Chat using the slerp method. This model is specifically designed for biomedical chat applications in Chinese, combining medical domain knowledge with general Chinese conversational abilities. It leverages the Llama3 architecture and is optimized for bfloat16 precision.

Loading preview...

Model Overview

shanchen/llama3-8B-slerp-biomed-chat-chinese is an 8 billion parameter language model created by merging two specialized Llama3-based models: shanchen/llama3-8B-slerp-med-chinese and shenzhi-wang/Llama3-8B-Chinese-Chat. The merge was performed using the slerp (spherical linear interpolation) method via LazyMergekit, combining their respective strengths.

Key Capabilities

  • Biomedical Domain Expertise: Inherits medical knowledge from llama3-8B-slerp-med-chinese, making it suitable for tasks requiring biomedical understanding.
  • Chinese Conversational Fluency: Benefits from Llama3-8B-Chinese-Chat for robust general-purpose Chinese chat capabilities.
  • Merged Architecture: Utilizes a slerp merge strategy with specific parameter weighting for self-attention and MLP layers, aiming for a balanced integration of features.
  • Llama3 Foundation: Built upon the Llama3 architecture, providing a strong base for language understanding and generation.

Good For

  • Chinese Biomedical Chatbots: Ideal for developing conversational AI agents that can discuss medical topics in Chinese.
  • Medical Information Retrieval: Can be used to process and generate responses related to health and medicine in a chat format.
  • Research and Development: Provides a specialized base model for further fine-tuning on specific Chinese biomedical NLP tasks.