Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_dare_ties_density0.2
Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_dare_ties_density0.2 is a 1.5 billion parameter language model created by Zachary1150 by merging pre-trained models with the DARE TIES method, using deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B as the base model. It merges two fine-tuned checkpoints from baselines_openrs with the aim of combining their capabilities into a single model.
Model Overview
This model, developed by Zachary1150, is a 1.5 billion parameter language model created through a merging process. It is built upon the deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B base model.
Merge Details
The model was constructed using the DARE TIES merge method, which combines DARE (Drop And REscale, described in the paper https://arxiv.org/abs/2311.03099) with the sign-election step of TIES-Merging. This technique sparsifies and combines the weight differences of multiple fine-tuned models to create a new, unified model.
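The core DARE operation can be illustrated with a short sketch. This is not mergekit's actual implementation; the function names are hypothetical, and the TIES sign-election step that the full dare_ties method also applies is omitted for brevity. Each delta (the difference between a fine-tuned model and the base) is randomly dropped at a rate of 1 - density, and the survivors are rescaled so the expected contribution is unchanged:

```python
import random

def dare_sparsify(delta, density, rng=random):
    # DARE (Drop And REscale): keep each delta-parameter entry with
    # probability `density`, then rescale survivors by 1/density so
    # the expected value of the delta is preserved.
    return [d / density if rng.random() < density else 0.0 for d in delta]

def dare_merge(base, deltas, weights, density):
    # Add the weighted, sparsified deltas onto the base parameters.
    # (Hypothetical helper; the real dare_ties method also elects a
    # consensus sign per parameter before summing.)
    merged = list(base)
    for delta, w in zip(deltas, weights):
        sparse = dare_sparsify(delta, density)
        merged = [m + w * s for m, s in zip(merged, sparse)]
    return merged
```

With density 0.2, roughly 80% of each delta's entries are zeroed, yet the merged model's expected parameters match a plain weighted average of the deltas.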
Constituent Models
Two specific checkpoints from baselines_openrs were merged to form this model:
/local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/cos_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
/local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
Each constituent model was merged with a weight of 0.5 and a density of 0.2, with weight normalization applied. The merge was performed using mergekit.
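Given these parameters, the mergekit configuration likely resembled the following. This is a reconstruction from the stated settings, not the exact file used; options not mentioned above (such as dtype) are omitted:

```yaml
merge_method: dare_ties
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
models:
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/cos_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
    parameters:
      weight: 0.5
      density: 0.2
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
    parameters:
      weight: 0.5
      density: 0.2
parameters:
  normalize: true
```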