Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.3_linear
Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.3_linear is a 1.5 billion parameter language model created by Zachary1150 using the Linear merge method. It combines two pre-trained components, 'cos_MRL4096_ROLLOUT4_LR5e-7' and 'accfmt_MRL4096_ROLLOUT4_LR5e-7', weighted 0.3 and 0.7 respectively. The merge is designed to leverage the strengths of its constituent models for general language understanding and generation tasks, and the resulting model offers a 131,072 token context length.
Model Overview
This model, merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.3_linear, is a 1.5 billion parameter language model developed by Zachary1150. It was constructed using the Linear merge method via the mergekit tool, combining the weights of two distinct pre-trained models.
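A Linear merge is, at its core, a weighted average of corresponding parameter tensors from the source models. The following is a minimal numerical sketch of that operation, using small hypothetical NumPy arrays in place of real model weight matrices; the array values and variable names are illustrative only, not taken from the actual checkpoints.

```python
import numpy as np

# Hypothetical stand-ins for one corresponding weight matrix
# from each of the two source models (not real checkpoint data).
w_cos = np.array([[1.0, 2.0], [3.0, 4.0]])
w_accfmt = np.array([[5.0, 6.0], [7.0, 8.0]])

# Merge weights as described in the model card: 0.3 and 0.7.
weights = [0.3, 0.7]

# Linear merge with normalization: the weighted sum is divided
# by the total weight (here 1.0, so normalization is a no-op).
total = sum(weights)
merged = (weights[0] * w_cos + weights[1] * w_accfmt) / total

print(merged)
```

In a real merge this averaging is applied tensor-by-tensor across every parameter of the two checkpoints, which is what mergekit automates.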
Merge Details
The model integrates two base models:
- /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/cos_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
- /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
These models were merged with a weight of 0.3 for the first model and 0.7 for the second. The merge also applied weight normalization and used the bfloat16 data type. This approach aims to combine the learned representations and capabilities of the constituent models into a single, more robust language model with a 131,072 token context window.
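The merge settings above can be expressed as a mergekit configuration. The following is a hedged sketch of what such a config could look like given the stated method, weights, normalization, and dtype; the exact file used by the author is not included in the card, so treat this as a plausible reconstruction rather than the original.

```yaml
# Sketch of a mergekit config matching the described merge
# (reconstructed from the model card, not the author's file).
models:
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/cos_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
    parameters:
      weight: 0.3
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
    parameters:
      weight: 0.7
merge_method: linear
parameters:
  normalize: true
dtype: bfloat16
```

A config like this would typically be run with `mergekit-yaml config.yml ./output-dir` to produce the merged checkpoint.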