ajtaltarabukin2022/merged_beat_champ_2model_dare_conservative
ajtaltarabukin2022/merged_beat_champ_2model_dare_conservative is a 32-billion-parameter language model created by ajtaltarabukin2022 by merging pre-trained models with the DARE TIES method. It combines the strengths of dura-lori/affine-5DoKPQhZmKnFk4mNEmH4UorbqHDe3PFAPvEfJyDwNkimoAMe and fakemoonlo/Affine-5FnfLT3ntQXDsAnVC5H5WNQYVTY7SSCbxU3kxqhNybtJeNGb. With a 32,768-token context length, it is designed to leverage the combined knowledge and capabilities of its constituent models for general language tasks.
Model Overview
The ajtaltarabukin2022/merged_beat_champ_2model_dare_conservative is a 32-billion-parameter language model produced by a merge operation. It was created using mergekit, a toolkit for combining pre-trained language models.
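The merged checkpoint is a standard causal language model, so it should load through the usual Hugging Face transformers interface. The snippet below is a minimal sketch under that assumption, not an official usage example from the model author.

```python
# Minimal sketch: loading the merged model with Hugging Face transformers.
# Assumes the repository ships standard causal-LM weights and tokenizer files.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ajtaltarabukin2022/merged_beat_champ_2model_dare_conservative"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype the merge was performed in
    device_map="auto",           # requires accelerate; a 32B model needs ~64 GB in bf16
)

prompt = "Explain model merging in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```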
Merge Details
This model was constructed using the DARE TIES merge method, which applies DARE's random dropping and rescaling of delta parameters (arXiv:2311.03099) on top of TIES sign-consensus merging (arXiv:2306.01708). The merging process utilized dura-lori/affine-5DoKPQhZmKnFk4mNEmH4UorbqHDe3PFAPvEfJyDwNkimoAMe as the base model.
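To make the method concrete, the sketch below illustrates the DARE TIES arithmetic on a single weight tensor with NumPy. It is a simplified illustration of the technique, not mergekit's actual implementation, and the function names are hypothetical.

```python
# Simplified, per-tensor illustration of DARE TIES (not mergekit's code).
import numpy as np

rng = np.random.default_rng(0)

def dare(delta: np.ndarray, density: float) -> np.ndarray:
    """DARE: keep each delta entry with probability `density`, rescale survivors by 1/density."""
    mask = rng.random(delta.shape) < density
    return np.where(mask, delta / density, 0.0)

def dare_ties_merge(base, finetuned, weights, density):
    """Merge fine-tuned tensors into `base` via DARE-sparsified, sign-elected deltas."""
    deltas = np.stack([w * dare(ft - base, density)
                       for ft, w in zip(finetuned, weights)])
    # TIES sign election: per parameter, pick the sign carrying more total mass.
    elected = np.sign(deltas.sum(axis=0))
    # Keep only deltas that agree with the elected sign, then fold them into the base.
    agreeing = np.where(np.sign(deltas) == elected, deltas, 0.0)
    return base + agreeing.sum(axis=0)
```

With the parameters from the Configuration section below, a call would look like `dare_ties_merge(base, [m_base, m_other], weights=[0.55, 0.45], density=0.9)`.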
Constituent Models
The merge incorporated the following models:
- Base Model: dura-lori/affine-5DoKPQhZmKnFk4mNEmH4UorbqHDe3PFAPvEfJyDwNkimoAMe
- Merged Model: fakemoonlo/Affine-5FnfLT3ntQXDsAnVC5H5WNQYVTY7SSCbxU3kxqhNybtJeNGb
Configuration
The merge configuration specifies bfloat16 as the output data type and applies a weighted average over layers 0 to 64 of the source models: the base model contributes a weight of 0.55 and the merged model a weight of 0.45. A density of 0.9 is applied during the DARE step, meaning roughly 90% of each model's delta parameters are retained and rescaled while the rest are dropped.
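For reference, the description above can be reconstructed as a mergekit configuration. The sketch below writes a config following mergekit's documented dare_ties schema and invokes the mergekit-yaml CLI; treat it as an illustrative reconstruction, not the author's exact file.

```python
# Sketch: reconstructing the merge configuration described above and running mergekit.
# Field names follow mergekit's documented schema; values come from this model card.
import subprocess
import yaml  # pip install pyyaml

config = {
    "merge_method": "dare_ties",
    "base_model": "dura-lori/affine-5DoKPQhZmKnFk4mNEmH4UorbqHDe3PFAPvEfJyDwNkimoAMe",
    "dtype": "bfloat16",
    "models": [
        {
            "model": "dura-lori/affine-5DoKPQhZmKnFk4mNEmH4UorbqHDe3PFAPvEfJyDwNkimoAMe",
            "parameters": {"weight": 0.55, "density": 0.9},
        },
        {
            "model": "fakemoonlo/Affine-5FnfLT3ntQXDsAnVC5H5WNQYVTY7SSCbxU3kxqhNybtJeNGb",
            "parameters": {"weight": 0.45, "density": 0.9},
        },
    ],
}

with open("merge_config.yml", "w") as f:
    yaml.safe_dump(config, f, sort_keys=False)

# mergekit-yaml is installed with `pip install mergekit`.
subprocess.run(["mergekit-yaml", "merge_config.yml", "./merged-model"], check=True)
```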