Technoculture/BioMistral-Carpybara-Slerp

Text generation · Concurrency cost: 1 · Model size: 7B · Quantization: FP8 · Context length: 4k · Published: Feb 21, 2024 · License: apache-2.0 · Architecture: Transformer (open weights)

Technoculture/BioMistral-Carpybara-Slerp is a 7-billion-parameter language model from Technoculture, created by merging BioMistral/BioMistral-7B-DARE and argilla/CapybaraHermes-2.5-Mistral-7B with spherical linear interpolation (slerp). The merge draws on the strengths of both base models, particularly in the medical and general conversational domains, and supports a 4096-token context length. It is designed for applications that need a blend of specialized biomedical knowledge and broad conversational capability.


Model Overview

Technoculture/BioMistral-Carpybara-Slerp is a 7 billion parameter language model developed by Technoculture. It is a merged model, combining the capabilities of two distinct base models: BioMistral/BioMistral-7B-DARE and argilla/CapybaraHermes-2.5-Mistral-7B. The merge was performed using a spherical linear interpolation (slerp) method, specifically configured with varying interpolation values for self-attention and MLP layers.
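Slerp merges of Mistral-family models are commonly expressed as a mergekit-style recipe. The sketch below shows what a configuration with per-layer interpolation values split by `self_attn` and `mlp` filters typically looks like; the specific `t` curves, layer ranges, and dtype are illustrative assumptions, not the exact values used for this model.

```yaml
# Hypothetical mergekit-style slerp recipe (values are assumptions for illustration)
slices:
  - sources:
      - model: BioMistral/BioMistral-7B-DARE
        layer_range: [0, 32]
      - model: argilla/CapybaraHermes-2.5-Mistral-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: BioMistral/BioMistral-7B-DARE
parameters:
  t:
    - filter: self_attn   # interpolation schedule for attention weights
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp         # a different schedule for MLP weights
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5          # default for all remaining tensors
dtype: bfloat16
```

Separate `t` schedules for attention and MLP layers let a merge weight one parent's attention behavior differently from its feed-forward behavior, which matches the "varying interpolation values" described above.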

Key Capabilities

  • Hybrid Knowledge Base: Integrates specialized biomedical knowledge from BioMistral-7B-DARE with the general conversational and instruction-following abilities of CapybaraHermes-2.5-Mistral-7B.
  • Merge Architecture: Uses a slerp merge, which interpolates along the arc between corresponding weight tensors rather than averaging them linearly, giving a balanced combination of features from both constituent models.
  • Standard Usage: Compatible with the Hugging Face Transformers library for text generation tasks, supporting a 4096-token context window.
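The slerp operation named in the Merge Architecture bullet can be sketched in a few lines. This is a minimal, self-contained illustration of spherical linear interpolation between two flattened weight vectors (not the mergekit implementation itself); the function name and the fallback threshold are choices made here for clarity.

```python
import math

def slerp(t, a, b, eps=1e-8):
    """Spherical linear interpolation between two weight vectors.

    Interpolates along the great-circle arc between the directions of
    `a` and `b`; falls back to plain linear interpolation when the two
    vectors are nearly colinear, where the slerp formula is unstable.
    """
    norm_a = math.sqrt(sum(x * x for x in a)) or eps
    norm_b = math.sqrt(sum(y * y for y in b)) or eps
    # Cosine of the angle between the two directions, clamped for acos.
    dot = sum((x / norm_a) * (y / norm_b) for x, y in zip(a, b))
    dot = max(-1.0, min(1.0, dot))
    if 1.0 - abs(dot) < eps:  # nearly colinear: lerp is stable here
        return [(1.0 - t) * x + t * y for x, y in zip(a, b)]
    omega = math.acos(dot)            # angle between the vectors
    sin_omega = math.sin(omega)
    ca = math.sin((1.0 - t) * omega) / sin_omega
    cb = math.sin(t * omega) / sin_omega
    return [ca * x + cb * y for x, y in zip(a, b)]
```

At `t = 0` the result is exactly the first vector and at `t = 1` exactly the second; in a real merge the same interpolation is applied tensor-by-tensor across both checkpoints, with `t` varying per layer as the merge configuration dictates.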

Intended Use Cases

This model is well-suited for applications that require:

  • Biomedical Question Answering: Leveraging the BioMistral component for tasks related to medical information.
  • General Conversational AI: Benefiting from the CapybaraHermes component for broader dialogue and instruction following.
  • Research and Development: As a base for further fine-tuning or experimentation in hybrid domain-specific and general-purpose LLMs.