stevez80/ErebusNeuralSamir-7B-dare-ties
stevez80/ErebusNeuralSamir-7B-dare-ties is a 7 billion parameter language model created by stevez80, built upon the Mistral-7B-v0.1 architecture. This model is a DARE TIES merge of SamirGPT-v1, NeuralHermes-2.5-Mistral-7B, and Mistral-7B-Erebus-v3, designed to combine the strengths of its constituent models. It features a 4096-token context length and is configured with int8_mask and bfloat16 dtype for efficient operation.
ErebusNeuralSamir-7B-dare-ties Overview
ErebusNeuralSamir-7B-dare-ties is a 7 billion parameter language model developed by stevez80. It is constructed using the DARE TIES merging method, combining three distinct models based on the Mistral-7B-v0.1 architecture: samir-fama/SamirGPT-v1, mlabonne/NeuralHermes-2.5-Mistral-7B, and KoboldAI/Mistral-7B-Erebus-v3. This merging approach aims to leverage the unique characteristics and capabilities of each base model.
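At a high level, DARE TIES first sparsifies each fine-tune's parameter deltas relative to the base model (DARE: randomly drop deltas and rescale the survivors by the density) and then combines them with a sign-consensus step (TIES). The toy numpy sketch below illustrates the idea on small matrices; it is a simplification for illustration, not mergekit's actual implementation, with the density and weight values mirroring this card's configuration:

```python
import numpy as np

def dare(delta, density, rng):
    """DARE: keep each delta entry with probability `density`, rescale
    survivors by 1/density so the expected delta matches the original."""
    mask = rng.random(delta.shape) < density
    return np.where(mask, delta / density, 0.0)

def ties_merge(base, deltas, weights):
    """Simplified TIES sign-consensus step: keep only weighted delta
    components whose sign agrees with the elementwise majority sign,
    then sum the survivors onto the base weights."""
    stacked = np.stack([w * d for w, d in zip(weights, deltas)])
    majority_sign = np.sign(stacked.sum(axis=0))
    agree = np.sign(stacked) == majority_sign
    merged_delta = np.where(agree, stacked, 0.0).sum(axis=0)
    return base + merged_delta

rng = np.random.default_rng(0)

# Toy stand-ins for one weight matrix of the base model and the
# three fine-tunes, using the density/weight values from this card.
base = rng.normal(size=(4, 4))
finetunes = [base + rng.normal(scale=0.1, size=(4, 4)) for _ in range(3)]
densities = [0.53, 0.53, 0.53]
weights = [0.3, 0.3, 0.4]

deltas = [dare(ft - base, d, rng) for ft, d in zip(finetunes, densities)]
merged = ties_merge(base, deltas, weights)
print(merged.shape)  # (4, 4)
```

In the real merge this procedure is applied to every parameter tensor of the three Mistral-7B fine-tunes; the rescaling in `dare` is what lets relatively low densities (0.53 here) preserve each model's contribution in expectation.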
Key Configuration Details
- Base Model: mistralai/Mistral-7B-v0.1
- Merge Method: DARE TIES, with specific density and weight parameters applied to each merged component.
- Merged Models:
  - samir-fama/SamirGPT-v1 (density: 0.53, weight: 0.3)
  - mlabonne/NeuralHermes-2.5-Mistral-7B (density: 0.53, weight: 0.3)
  - KoboldAI/Mistral-7B-Erebus-v3 (density: 0.53, weight: 0.4)
- Parameters: Includes `int8_mask` for potential quantization benefits and uses `bfloat16` for numerical precision and efficiency.
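These settings correspond to a mergekit-style recipe. The exact configuration file is not included in this card, so the following is a plausible reconstruction using mergekit's documented YAML schema and the values listed above:

```yaml
models:
  - model: samir-fama/SamirGPT-v1
    parameters:
      density: 0.53
      weight: 0.3
  - model: mlabonne/NeuralHermes-2.5-Mistral-7B
    parameters:
      density: 0.53
      weight: 0.3
  - model: KoboldAI/Mistral-7B-Erebus-v3
    parameters:
      density: 0.53
      weight: 0.4
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
dtype: bfloat16
```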
Potential Use Cases
Given its merged nature, this model is likely suited to applications that benefit from a blend of its constituent models' strengths. Developers seeking a 7B-parameter model with a 4096-token context length, built from a combination of well-regarded Mistral-7B fine-tunes, may find it useful for general-purpose language generation tasks.