Technoculture/BioMistral-Carpybara-Slerp
Technoculture/BioMistral-Carpybara-Slerp is a 7 billion parameter language model created by Technoculture, formed by merging BioMistral/BioMistral-7B-DARE and argilla/CapybaraHermes-2.5-Mistral-7B via spherical linear interpolation (slerp). The merge combines the strengths of its base components, particularly in medical and general conversational domains, and supports a 4096-token context length. It is designed for applications that need a blend of specialized biomedical knowledge and broad conversational capability.
Model Overview
Technoculture/BioMistral-Carpybara-Slerp is a 7 billion parameter language model developed by Technoculture. It is a merged model, combining the capabilities of two distinct base models: BioMistral/BioMistral-7B-DARE and argilla/CapybaraHermes-2.5-Mistral-7B. The merge was performed using a spherical linear interpolation (slerp) method, specifically configured with varying interpolation values for self-attention and MLP layers.
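The idea behind a slerp merge is to interpolate between the two models' weight tensors along a great circle rather than a straight line, which better preserves the geometry of the parameter space. The sketch below illustrates the interpolation itself on flattened tensors; the normalization and fallback handling are illustrative assumptions, not the exact implementation used to produce this model.

```python
# Minimal sketch of spherical linear interpolation (slerp) between two
# weight tensors, as used conceptually in parameter-space model merging.
# The flatten-and-normalize treatment here is an illustrative assumption.
import numpy as np

def slerp(t: float, a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Interpolate between tensors a and b at fraction t in [0, 1]."""
    a_flat, b_flat = a.ravel(), b.ravel()
    # Angle between the two tensors, treated as high-dimensional vectors.
    a_n = a_flat / (np.linalg.norm(a_flat) + eps)
    b_n = b_flat / (np.linalg.norm(b_flat) + eps)
    theta = np.arccos(np.clip(np.dot(a_n, b_n), -1.0, 1.0))
    if theta < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        return ((1 - t) * a_flat + t * b_flat).reshape(a.shape)
    s = np.sin(theta)
    out = (np.sin((1 - t) * theta) / s) * a_flat + (np.sin(t * theta) / s) * b_flat
    return out.reshape(a.shape)
```

In practice the merge configuration assigns different interpolation values `t` per layer group (e.g. self-attention vs. MLP weights), so the two parent models contribute unevenly across the network.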
Key Capabilities
- Hybrid Knowledge Base: Integrates specialized biomedical knowledge from BioMistral-7B-DARE with the general conversational and instruction-following abilities of CapybaraHermes-2.5-Mistral-7B.
- Merge Architecture: Utilizes a slerp merge, allowing for a balanced combination of features from its constituent models.
- Standard Usage: Compatible with the Hugging Face Transformers library for text generation tasks, supporting a 4096-token context window.
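Since the model is compatible with the standard Transformers API, loading and generating from it follows the usual pattern. The sketch below assumes the weights can be downloaded from the Hub; the prompt and generation parameters are illustrative, not recommended settings.

```python
# Hedged usage sketch: load the merged model with the Hugging Face
# Transformers library and generate a completion. Requires downloading
# the ~7B-parameter weights; generation settings are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "Technoculture/BioMistral-Carpybara-Slerp"

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    """Generate a completion for `prompt` with the merged model."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.7,
    )
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Keep prompts plus expected output within the 4096-token context window noted above.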
Intended Use Cases
This model is well-suited for applications that require:
- Biomedical Question Answering: Leveraging the BioMistral component for tasks related to medical information.
- General Conversational AI: Benefiting from the CapybaraHermes component for broader dialogue and instruction following.
- Research and Development: As a base for further fine-tuning or experimentation in hybrid domain-specific and general-purpose LLMs.