Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_ties

Text Generation · Model size: 1.5B · Quantization: BF16 · Context length: 32k · Published: Dec 25, 2025 · Architecture: Transformer

Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_ties is a 1.5 billion parameter language model produced by merging two fine-tuned checkpoints with the TIES method, using deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B as the base model. The merge combines the strengths of its constituent checkpoints in a single model, offering a compact yet capable option for general language tasks.


Model Overview

Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.5_ties is a 1.5 billion parameter language model created by Zachary1150. It is built upon the deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B base model and uses the TIES merge method to combine two fine-tuned checkpoints of that base into a single model.

Merge Details

This model was constructed with mergekit, specifically employing the TIES (TrIm, Elect Sign & Merge) merging technique. TIES reduces interference between merged models by trimming low-magnitude parameter changes, electing a majority sign for each parameter, and merging only the changes that agree with the elected sign. The two models merged were:

  • /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/cos_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
  • /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
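A mergekit configuration consistent with the details in this card might look like the following. This is a reconstructed sketch, not the actual config used for the merge, which has not been published; the `dtype` setting in particular is an assumption based on the BF16 quantization listed above.

```yaml
models:
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/cos_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
    parameters:
      weight: 0.5
      density: 0.5
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
    parameters:
      weight: 0.5
      density: 0.5
merge_method: ties
base_model: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B
parameters:
  normalize: true
dtype: bfloat16
```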

Each constituent model contributed a weight of 0.5 and a density of 0.5 during the merge, with normalization applied. This balanced configuration integrates the learned representations of both sources equally.
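For intuition, the trim / elect-sign / merge steps of TIES with these weight, density, and normalization settings can be sketched on flat parameter vectors. This is a toy illustration of the arithmetic, not mergekit's actual implementation:

```python
import numpy as np

def ties_merge(base, task_models, weights, density):
    """Toy TIES merge over flat parameter vectors (illustrative only)."""
    # Task vectors: each model's change relative to the shared base
    deltas = [m - base for m in task_models]

    # Trim: keep only the top `density` fraction of entries by magnitude
    trimmed = []
    for d in deltas:
        k = int(np.ceil(density * d.size))
        thresh = np.sort(np.abs(d))[-k]
        trimmed.append(np.where(np.abs(d) >= thresh, d, 0.0))

    # Elect sign: per-parameter sign of the weighted sum of trimmed deltas
    elected = np.sign(sum(w * t for w, t in zip(weights, trimmed)))

    # Merge: combine only entries whose sign agrees with the elected sign,
    # normalizing by the total weight of the agreeing models
    merged = np.zeros_like(base)
    counts = np.zeros_like(base)
    for t, w in zip(trimmed, weights):
        agree = (np.sign(t) == elected) & (t != 0)
        merged += np.where(agree, w * t, 0.0)
        counts += np.where(agree, w, 0.0)
    merged = np.divide(merged, counts, out=np.zeros_like(merged),
                       where=counts > 0)
    return base + merged

# Example: weight 0.5 / density 0.5 for both models, as in this card
base = np.zeros(4)
m1 = base + np.array([1.0, -2.0, 0.5, 0.0])
m2 = base + np.array([1.0, 2.0, -0.5, 3.0])
out = ties_merge(base, [m1, m2], weights=[0.5, 0.5], density=0.5)
print(out)  # → [1. 0. 0. 3.]
```

Note how the second parameter is dropped entirely: the two models disagree on its sign with equal weight, so no majority sign is elected and neither change survives.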

Potential Use Cases

Given its foundation and merging strategy, this model is suitable for general language generation and understanding tasks where a 1.5B parameter model is appropriate. Its merged nature suggests it might exhibit a broader range of capabilities than its individual components, making it a versatile option for various NLP applications.