Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_dare_ties
Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_dare_ties is a 1.5-billion-parameter language model created by Zachary1150. It merges two pre-trained models on top of the base model deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B using the DARE TIES merge method. With a context length of 131,072 tokens, it is suitable for tasks requiring extensive contextual understanding, and the merged architecture targets general language generation and understanding with potentially improved performance over either constituent model.
Overview
This model, Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_dare_ties, is a 1.5-billion-parameter language model developed by Zachary1150. It was created using the DARE TIES merge method, which combines DARE's drop-and-rescale sparsification of task vectors with TIES-Merging's sign election, and is built on deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B as its base model.
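To make the DARE TIES method concrete, here is a minimal toy sketch on NumPy arrays. This is an illustration of the general technique, not mergekit's actual implementation: `dare` randomly drops task-vector entries and rescales survivors by 1/density, and `dare_ties` then elects a per-parameter sign from the weighted sum and averages the agreeing deltas. All function names and the toy vectors are hypothetical.

```python
import numpy as np

def dare(delta, density, rng):
    """DARE: randomly drop task-vector entries with probability (1 - density),
    rescaling survivors by 1/density so the expected delta is preserved."""
    mask = rng.random(delta.shape) < density
    return np.where(mask, delta / density, 0.0)

def dare_ties(base, finetuned, weights, density=0.5, seed=0):
    """Toy DARE TIES merge: sparsify each weighted task vector with DARE,
    elect a per-parameter sign (TIES), then average the deltas whose sign
    agrees with the elected one, normalized by the agreeing weight mass."""
    rng = np.random.default_rng(seed)
    deltas = [w * dare(ft - base, density, rng)
              for w, ft in zip(weights, finetuned)]
    elected = np.sign(sum(deltas))  # TIES sign election per parameter
    merged = np.zeros_like(base)
    mass = np.zeros_like(base)
    for w, d in zip(weights, deltas):
        agree = np.sign(d) == elected
        merged += np.where(agree, d, 0.0)
        mass += np.where(agree & (d != 0), w, 0.0)
    # Normalize by the total weight of agreeing (non-dropped) contributions.
    merged = np.where(mass > 0, merged / np.maximum(mass, 1e-12), 0.0)
    return base + merged

# Toy demonstration with two "fine-tuned" parameter vectors.
base = np.zeros(6)
m1 = np.array([1.0, -1.0, 2.0, 0.0,  3.0, -2.0])
m2 = np.array([1.0,  1.0, 2.0, 0.0, -3.0, -2.0])
print(dare_ties(base, [m1, m2], weights=[0.5, 0.5], density=0.5))
```

With `density=1.0` (no dropping) the behavior is deterministic: parameters where the two models disagree in sign cancel out of the sign election, while agreeing parameters survive at full strength.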
Merge Details
The model integrates two distinct pre-trained language models, specifically:
- /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/acc_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
- /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
Each constituent model contributed a weight of 0.5 at a density of 0.5, and the merge process normalized the combined parameters. This configuration aims to combine the strengths of both constituent models, potentially improving performance across a range of language tasks. The model supports a context length of 131,072 tokens.
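A merge with these settings could be expressed as a mergekit-style YAML configuration of roughly the following shape. This is a reconstruction from the parameters stated above (weight 0.5, density 0.5, normalization, DARE TIES, the listed checkpoints), not the author's actual config file; field names follow mergekit's documented schema.

```yaml
merge_method: dare_ties
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
models:
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/acc_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
    parameters:
      weight: 0.5
      density: 0.5
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
    parameters:
      weight: 0.5
      density: 0.5
parameters:
  normalize: true
```

The `normalize: true` setting corresponds to the parameter normalization described above; all other values are taken directly from the merge details.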