Azazelle/Mocha-Dare-7b-ex
Azazelle/Mocha-Dare-7b-ex is a 7 billion parameter language model based on the Mistral-7B-v0.1 architecture, created by Azazelle through a DARE TIES merge. This model integrates capabilities from Open-Orca/Mistral-7B-OpenOrca, akjindal53244/Mistral-7B-v0.1-Open-Platypus, and WizardLM/WizardMath-7B-V1.1. It is designed to combine general instruction following with enhanced mathematical reasoning and conversational abilities.
Overview
Mocha-Dare-7b-ex is a 7 billion parameter language model developed by Azazelle, built upon the mistralai/Mistral-7B-v0.1 base model. It was created using the DARE TIES merge method, which combines the strengths of multiple pre-trained models. This merging technique allows for the integration of diverse capabilities into a single, cohesive model.
Key Capabilities
- Instruction Following: Inherits general instruction-following abilities from Open-Orca/Mistral-7B-OpenOrca.
- Conversational & Reasoning: Benefits from the akjindal53244/Mistral-7B-v0.1-Open-Platypus component, enhancing its conversational and reasoning skills.
- Mathematical Reasoning: Incorporates WizardLM/WizardMath-7B-V1.1 to improve its performance on mathematical tasks and problem-solving.
Merge Details
The model was constructed using mergekit with the DARE TIES method. The configuration assigns each merged model its own density and weight values (expressible in mergekit as gradients that vary across layers), tuning how much each component contributes to the final weights. The goal is a single model that retains general instruction following while gaining specialized mathematical ability.
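The exact configuration used for this merge is not reproduced here, but a mergekit DARE TIES config for these components generally takes the following shape. Note that the density and weight values below are illustrative placeholders, not the actual parameters used for Mocha-Dare-7b-ex:

```yaml
# Illustrative mergekit config sketch for a DARE TIES merge on a
# Mistral-7B base. All density/weight values are placeholders, not
# the actual parameters used for Mocha-Dare-7b-ex.
models:
  - model: Open-Orca/Mistral-7B-OpenOrca
    parameters:
      density: 0.5   # fraction of delta weights retained after random dropping
      weight: 0.4    # scale of this model's contribution to the merge
  - model: akjindal53244/Mistral-7B-v0.1-Open-Platypus
    parameters:
      density: 0.5
      weight: 0.3
  - model: WizardLM/WizardMath-7B-V1.1
    parameters:
      density: 0.5
      weight: 0.3
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
dtype: bfloat16
```

With mergekit installed, a config like this is executed with `mergekit-yaml config.yml ./output-model`, which writes the merged checkpoint to the given output directory.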