Kabster/Bio-Mistralv2-Squared
Text Generation · 7B parameters · FP8 quantization · 4k context length · Published: Mar 9, 2024 · License: apache-2.0 · Architecture: Transformer · Open Weights

Kabster/Bio-Mistralv2-Squared is a 7 billion parameter language model merged from BioMistral/BioMistral-7B and mistralai/Mistral-7B-Instruct-v0.2 using the SLERP method. This model combines the general instruction-following capabilities of Mistral-7B-Instruct-v0.2 with the specialized biomedical knowledge of BioMistral-7B. It is designed for applications requiring both broad language understanding and specific expertise in biomedical domains, operating with a 4096-token context length.


Overview

Kabster/Bio-Mistralv2-Squared is a 7 billion parameter language model created by merging two distinct pre-trained models: BioMistral/BioMistral-7B and mistralai/Mistral-7B-Instruct-v0.2. This merge was performed using the SLERP (Spherical Linear Interpolation) method via mergekit, aiming to combine their respective strengths.
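A SLERP merge of this kind is typically driven by a mergekit YAML configuration. The fragment below is an illustrative sketch consistent with the details stated here (32 layers, SLERP, BioMistral as base, per-tensor `t` schedules); the specific `t` values and dtype are assumptions, not the model author's published settings.

```yaml
slices:
  - sources:
      - model: BioMistral/BioMistral-7B
        layer_range: [0, 32]
      - model: mistralai/Mistral-7B-Instruct-v0.2
        layer_range: [0, 32]
merge_method: slerp
base_model: BioMistral/BioMistral-7B
parameters:
  t:
    # Illustrative schedules: t varies across layers separately
    # for attention and MLP tensors
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5   # default for all other tensors
dtype: bfloat16
```

In this format, `t = 0` keeps the base model's weights and `t = 1` takes the other model's, with intermediate values interpolated spherically.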

Key Capabilities

  • Hybrid Knowledge Base: Integrates the general instruction-following abilities of Mistral-7B-Instruct-v0.2 with the specialized biomedical knowledge from BioMistral-7B.
  • Instruction Following: Benefits from the instruction-tuned nature of Mistral-7B-Instruct-v0.2, making it suitable for various prompt-based tasks.
  • Biomedical Specialization: Inherits domain-specific understanding from BioMistral-7B, enhancing its performance on biomedical texts and queries.

Merge Details

The model was constructed by applying SLERP across all 32 layers of both base models. The interpolation factor t was varied per layer for the self_attn and mlp tensors, so each base model's contribution differs across the network rather than being a single uniform blend. The base model for the merge operation was specified as BioMistral/BioMistral-7B.
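To make the interpolation concrete, here is a minimal NumPy sketch of SLERP as applied to a pair of weight tensors. This is an illustration of the general technique, not mergekit's exact implementation; the fallback threshold and epsilon are common conventions.

```python
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate values follow the
    arc on the hypersphere between the two (flattened) tensors.
    """
    v0_flat = v0.ravel()
    v1_flat = v1.ravel()
    # Normalize copies to unit length to measure the angle between them
    v0_n = v0_flat / (np.linalg.norm(v0_flat) + eps)
    v1_n = v1_flat / (np.linalg.norm(v1_flat) + eps)
    dot = np.clip(np.dot(v0_n, v1_n), -1.0, 1.0)
    # Near-parallel tensors: fall back to plain linear interpolation
    if abs(dot) > 0.9995:
        return (1.0 - t) * v0 + t * v1
    theta = np.arccos(dot)            # angle between the two tensors
    sin_theta = np.sin(theta)
    s0 = np.sin((1.0 - t) * theta) / sin_theta
    s1 = np.sin(t * theta) / sin_theta
    return (s0 * v0_flat + s1 * v1_flat).reshape(v0.shape)
```

Unlike linear averaging, SLERP preserves the geometric relationship between the two weight sets, which is why it is a popular choice for model merging.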

Good For

  • Applications requiring a blend of general conversational AI and specific biomedical expertise.
  • Tasks involving medical text analysis, drug information, biological research, or clinical question answering where instruction following is also crucial.
  • Developers looking for a model that can handle both common language tasks and specialized scientific queries within the biomedical field.
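For the use cases above, the model can be queried like any other Mistral-Instruct derivative. The sketch below is a hedged usage example assuming the model is hosted on the Hugging Face Hub under the ID shown; the prompt template follows Mistral-7B-Instruct-v0.2's [INST] convention, and the generation settings are illustrative.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a query in Mistral-Instruct's [INST] ... [/INST] template."""
    return f"<s>[INST] {user_message} [/INST]"

def main():
    # Imports kept local so the prompt helper can be used without
    # transformers installed
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "Kabster/Bio-Mistralv2-Squared"
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, device_map="auto"
    )

    # Illustrative biomedical query combining domain knowledge with
    # instruction following
    prompt = build_prompt("What drug class does metformin belong to?")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

if __name__ == "__main__":
    main()
```

Greedy decoding (do_sample=False) is a reasonable default for factual biomedical questions; sampling parameters can be enabled for more open-ended generation.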