Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR1e-6_w0.5_dare_ties
Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR1e-6_w0.5_dare_ties is a 1.5 billion parameter language model merge built on the deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B base model. It was created with the DARE TIES merge method, combining two actor models from baselines_openrs checkpoints. It is intended for general language tasks, with the merge aiming to combine the strengths of its two constituent checkpoints.
Model Overview
This model, merge_cosfmt_MRL4096_ROLLOUT4_LR1e-6_w0.5_dare_ties, is a 1.5 billion parameter language model created by Zachary1150. It is a merge of pre-trained language models produced with the DARE TIES merge method, which combines DARE's random pruning and rescaling of task vectors (the deltas between fine-tuned and base weights) with TIES-Merging's sign-consensus averaging.
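To make the method concrete, here is a minimal sketch of DARE TIES on flat parameter arrays. This is an illustrative NumPy implementation of the algorithm as described above, not mergekit's actual code; the function name and the tie-breaking details are this sketch's own choices.

```python
import numpy as np

def dare_ties_merge(base, deltas, weights, density=0.5, seed=0):
    """Sketch of DARE TIES merging on flat parameter arrays.

    `deltas` are task vectors (fine-tuned weights minus base weights).
    Conceptual illustration only; not mergekit's implementation.
    """
    rng = np.random.default_rng(seed)
    pruned = []
    for d in deltas:
        # DARE: randomly drop a (1 - density) fraction of each delta and
        # rescale survivors by 1/density so the expectation is unchanged.
        mask = rng.random(d.shape) < density
        pruned.append(np.where(mask, d / density, 0.0))
    # TIES: elect a per-parameter sign from the weighted sum of deltas.
    elected = np.sign(sum(w * d for w, d in zip(weights, pruned)))
    # Keep only contributions whose sign agrees with the elected sign,
    # then normalize by the total agreeing weight.
    total = np.zeros_like(base)
    norm = np.zeros_like(base)
    for w, d in zip(weights, pruned):
        agree = np.sign(d) == elected
        total += w * np.where(agree, d, 0.0)
        norm += w * agree
    return base + total / np.maximum(norm, 1e-8)
```

With two agreeing deltas and equal weights of 0.5, as in this merge, the result is simply the base plus the (pruned) average delta.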
Merge Details
The model's foundation is the deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B base model. Two distinct actor models from baselines_openrs checkpoints were combined to form this merge:
- /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/cos_MRL4096_ROLLOUT4_LR1e-6/global_step_50/actor/huggingface
- /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR1e-6/global_step_50/actor/huggingface
Each contributing model was assigned a weight of 0.5 and a density of 0.5 during the merging process, with normalization applied. The merge was performed using mergekit.
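Given the base model, weights, densities, and normalization stated above, the mergekit configuration plausibly resembled the following. This is a reconstruction from the card's stated parameters, not the original config file:

```yaml
# Hypothetical reconstruction; dtype and other unstated options omitted.
merge_method: dare_ties
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
models:
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/cos_MRL4096_ROLLOUT4_LR1e-6/global_step_50/actor/huggingface
    parameters:
      weight: 0.5
      density: 0.5
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR1e-6/global_step_50/actor/huggingface
    parameters:
      weight: 0.5
      density: 0.5
parameters:
  normalize: true
```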
Potential Use Cases
As a merge of two specialized actor checkpoints, this model is likely suitable for general language generation and understanding tasks, potentially benefiting from the combined strengths of its constituent models.