Sina-Thor-7b-Merge: An Experimental DARE Merge
Sina-Thor-7b-Merge is a 7-billion-parameter language model by Azazelle, built on the Mistral-7B-v0.1 architecture. It is an experimental DARE (Drop And REscale) merge that combines several distinct models in an attempt to improve performance and broaden capabilities.
Key Merge Components:
- Base Model: mistralai/Mistral-7B-v0.1
- Merged Models:
  - rishiraj/smol-7b (weight: 0.2, density: 0.41)
  - SanjiWatsuki/openchat-3.5-1210-starling-slerp (weight: 0.33, density: 0.54)
  - Azazelle/Dumb-Maidlet (weight: 0.53, density: 0.71)
Technical Details:
The merge uses the dare_ties method, with int8_mask enabled for efficiency during merging and bfloat16 as the output data type. This approach aims to combine the strengths of each contributing model in a structured, experimental way.
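The parameters above can be expressed as a mergekit configuration. The original config file is not reproduced here, so the following is a plausible reconstruction from the stated weights, densities, and method, not the exact file used:

```yaml
# Reconstructed mergekit config (sketch); values taken from the
# component list above, file layout assumed from mergekit conventions.
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
models:
  - model: rishiraj/smol-7b
    parameters:
      weight: 0.2
      density: 0.41
  - model: SanjiWatsuki/openchat-3.5-1210-starling-slerp
    parameters:
      weight: 0.33
      density: 0.54
  - model: Azazelle/Dumb-Maidlet
    parameters:
      weight: 0.53
      density: 0.71
parameters:
  int8_mask: true
dtype: bfloat16
```

In dare_ties, each model's `density` controls the fraction of delta parameters retained after random dropping (with rescaling), and `weight` scales its contribution to the final merge.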
Good for:
- Experimentation with DARE merges: Ideal for researchers and developers interested in exploring the effects and performance of DARE merging techniques.
- General language generation: Suitable for a variety of text-based tasks, benefiting from the diverse origins of its merged components.
- Building upon Mistral-7B: Offers a modified base for projects that typically use Mistral-7B, potentially providing different response characteristics.