ajtaltarabukin2022/deepseekconf

Text Generation · Concurrency Cost: 2 · Model Size: 32B · Quantization: FP8 · Context Length: 32k · Published: Apr 19, 2026 · Architecture: Transformer

ajtaltarabukin2022/deepseekconf is a 32 billion parameter language model created by ajtaltarabukin2022 by merging pre-trained models with the TIES method. It integrates components from Sanguineey/Affine-8-5FqLeXxuAqcqZ9aPX5DcaqKadmwpQUqYx5dCBfMT5QPBEx8b, luis1027/affine-5DPY89HQqA1ghQje5KqwYsvubwpG3tFk21KpbEyXK6ZngAn5, and michael-chan-000/affine-5Eh8v9zUpcBwNLRzE3bRv2FFhnaNPERRLdvEH8SdwLiahUh8. Its defining characteristic is its origin as a merged model: it leverages the strengths of its constituent parts for general language understanding and generation tasks.


Model Overview

The ajtaltarabukin2022/deepseekconf is a 32 billion parameter language model developed by ajtaltarabukin2022. It was created with the mergekit tool using the TIES (TrIm, Elect Sign & Merge) method, which trims each fine-tuned model's parameter changes to their largest-magnitude values, elects a per-parameter sign, and merges only the values that agree with that sign.

Merge Details

This model is a composite of several pre-trained language models, with Sanguineey/Affine-8-5FqLeXxuAqcqZ9aPX5DcaqKadmwpQUqYx5dCBfMT5QPBEx8b serving as the base model. The merge incorporated contributions from:

  • luis1027/affine-5DPY89HQqA1ghQje5KqwYsvubwpG3tFk21KpbEyXK6ZngAn5
  • michael-chan-000/affine-5Eh8v9zUpcBwNLRzE3bRv2FFhnaNPERRLdvEH8SdwLiahUh8

The merge assigned per-model weights across layers 0 to 64: 0.35 for the base model, 0.4 for michael-chan-000's model, and 0.25 for luis1027's model. Weighting each model's contribution this way combines their learned representations into a single, more robust model, as sketched below.
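To make the mechanics concrete, here is a minimal, self-contained sketch of weighted TIES merging on toy tensors, using the per-model weights reported above (0.35, 0.4, 0.25). It is illustrative only: the tensor shapes, the `density` value, and the `ties_merge` helper name are assumptions, and the actual merge was performed by mergekit, not by this code.

```python
import numpy as np

def ties_merge(base, finetuned, weights, density=0.5):
    """Weighted TIES merge of fine-tuned tensors onto a base tensor.

    base:      base-model parameter tensor
    finetuned: fine-tuned parameter tensors, same shape as base
    weights:   per-model merge weights, e.g. [0.35, 0.4, 0.25]
    density:   fraction of each task vector kept after trimming
               (assumed value; the card does not report it)
    """
    # 1. Task vectors: how each fine-tuned model differs from the base,
    #    scaled by its merge weight.
    deltas = [w * (ft - base) for ft, w in zip(finetuned, weights)]

    # 2. Trim: keep only the largest-magnitude `density` fraction of
    #    each task vector, zeroing the rest.
    trimmed = []
    for d in deltas:
        k = max(1, int(density * d.size))
        threshold = np.sort(np.abs(d), axis=None)[-k]
        trimmed.append(np.where(np.abs(d) >= threshold, d, 0.0))

    # 3. Elect a sign per parameter: the sign of the summed trimmed deltas.
    elected_sign = np.sign(np.sum(trimmed, axis=0))

    # 4. Disjoint merge: average only values agreeing with the elected
    #    sign, then add the merged task vector back onto the base.
    agree = [np.where(np.sign(d) == elected_sign, d, 0.0) for d in trimmed]
    counts = np.sum([(a != 0.0).astype(float) for a in agree], axis=0)
    merged_delta = np.sum(agree, axis=0) / np.maximum(counts, 1.0)
    return base + merged_delta

# Toy demonstration with random 4x4 "parameter" tensors.
rng = np.random.default_rng(0)
base = rng.normal(size=(4, 4))
models = [base + rng.normal(scale=0.1, size=(4, 4)) for _ in range(3)]
merged = ties_merge(base, models, weights=[0.35, 0.4, 0.25])
print(merged.shape)
```

Applying the weight before trimming is one reasonable reading of the weighted merge; mergekit applies its own weighting and trimming logic per layer.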

Potential Use Cases

As a merged model, ajtaltarabukin2022/deepseekconf is suitable for general-purpose natural language processing tasks where a blend of capabilities from its constituent models is beneficial. Its 32 billion parameters suggest a capacity for complex language understanding and generation.
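If the model is published on the Hugging Face Hub under the repository id shown above (an assumption based on the naming on this page), it can be loaded like any other causal language model with the transformers library; the prompt and generation settings below are illustrative defaults, not recommended values from the model card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repository id assumed from the model name on this page.
model_id = "ajtaltarabukin2022/deepseekconf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# A 32B model needs substantial GPU memory; device_map="auto" shards it
# across available devices, and torch_dtype="auto" keeps the checkpoint's
# stored precision (the card lists FP8 quantization).
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype="auto",
)

prompt = "Explain what a merged language model is in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```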