Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_ties
Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_ties is a 1.5 billion parameter language model merged using the TIES method, based on deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B. The model combines two distinct checkpoints, 'cos_MRL4096_ROLLOUT4_LR2e-6' and 'accfmt_MRL4096_ROLLOUT4_LR2e-6', each contributing a weight of 0.5. It is designed to combine the strengths of its constituent models for general language understanding and generation tasks, offering a compact yet capable model.
Model Overview
This model, merge_cosfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_ties, is a 1.5 billion parameter language model created by Zachary1150. It was developed using the TIES merge method (TrIm, Elect Sign & Merge), which combines multiple fine-tuned models into a single, more capable model by trimming small parameter changes, resolving sign conflicts, and merging the remaining updates. The base model for this merge is deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B.
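The trim/elect/merge procedure described above can be sketched on flat parameter vectors. This is a simplified illustration of the TIES idea, not the exact implementation used to produce this model; the function name and defaults are chosen for this example.

```python
import numpy as np

def ties_merge(base, finetuned, density=0.5, weights=None):
    """Toy TIES merge over flat parameter vectors (illustration only)."""
    # Task vectors: the delta each fine-tuned model applies to the base.
    deltas = [ft - base for ft in finetuned]
    if weights is None:
        weights = [1.0 / len(deltas)] * len(deltas)

    # Trim: keep only the top-`density` fraction of each delta by magnitude.
    trimmed = []
    for d in deltas:
        k = int(np.ceil(density * d.size))
        thresh = np.sort(np.abs(d))[-k]
        trimmed.append(np.where(np.abs(d) >= thresh, d, 0.0))

    # Elect sign: per-parameter majority sign across the trimmed deltas.
    sign = np.sign(sum(trimmed))

    # Merge: weighted average of the deltas that agree with the elected sign.
    merged = np.zeros_like(base)
    counts = np.zeros_like(base)
    for w, t in zip(weights, trimmed):
        mask = (np.sign(t) == sign) & (t != 0)
        merged += np.where(mask, w * t, 0.0)
        counts += np.where(mask, w, 0.0)
    merged = np.where(counts > 0, merged / counts, 0.0)
    return base + merged
```

Conflicting updates (same parameter pushed in opposite directions by the two checkpoints) cancel at the sign-election step, which is the property TIES relies on to avoid interference between the merged models.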
Merge Details
Two specific checkpoints were merged to create this model:
- /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/cos_MRL4096_ROLLOUT4_LR2e-6/global_step_40/actor/huggingface
- /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR2e-6/global_step_30/actor/huggingface
Each of these constituent models contributed a weight of 0.5 and was trimmed to a density of 0.5 during the merging process. The merge was configured to normalize weights and was performed in bfloat16. This approach aims to consolidate the learned representations of the individual checkpoints into one model, potentially improving overall performance and generalization.
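Assuming the merge was produced with a mergekit-style configuration (the tool is not named in this card), the settings above would correspond to a config along these lines, with the checkpoint paths abbreviated:

```yaml
# Hypothetical mergekit-style config reflecting the stated settings.
merge_method: ties
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
models:
  - model: .../cos_MRL4096_ROLLOUT4_LR2e-6/global_step_40/actor/huggingface
    parameters:
      weight: 0.5
      density: 0.5
  - model: .../accfmt_MRL4096_ROLLOUT4_LR2e-6/global_step_30/actor/huggingface
    parameters:
      weight: 0.5
      density: 0.5
parameters:
  normalize: true
dtype: bfloat16
```

Here `weight` sets each checkpoint's contribution, `density` controls the fraction of parameters kept in the trim step, and `normalize: true` rescales the combined weights so they sum to one.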