Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.1_linear

Text generation · Model size: 1.5B · Quant: BF16 · Context length: 32k · Published: Dec 20, 2025 · Architecture: Transformer

Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.1_linear is a 1.5 billion parameter language model created by Zachary1150 using the Linear merge method. It was produced by combining two pre-trained language models, specifically actor checkpoints from the baselines_openrs training runs, and is intended for general language tasks, leveraging the combined strengths of its constituent models.


Model Overview

This model, merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.1_linear, is a 1.5 billion parameter language model developed by Zachary1150. It was constructed using the Linear merge method via mergekit, combining the weights of two distinct pre-trained language models.

Merge Details

The merge process specifically integrated two actor checkpoints from the /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/ directory:

  • cos_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface (weighted at 0.1)
  • accfmt_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface (weighted at 0.9)
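The model card does not publish the mergekit configuration file itself, but given the stated merge method, weights, and dtype, it would plausibly look like the following sketch (the YAML below is a reconstruction, not the author's actual config):

```yaml
# Hypothetical mergekit config matching the described merge.
merge_method: linear
models:
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/cos_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
    parameters:
      weight: 0.1
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
    parameters:
      weight: 0.9
parameters:
  normalize: true
dtype: bfloat16
```

A config like this would be run with `mergekit-yaml config.yml ./output-model`, producing the merged checkpoint in Hugging Face format.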

This configuration, using the bfloat16 data type with weight normalization enabled, aims to leverage the complementary strengths of the merged components for improved performance in general language understanding and generation tasks. The model's architecture and capabilities are inherited from its source models, whose weights are averaged rather than retrained.
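Conceptually, a linear merge is just a per-parameter weighted average of the source models' weights, with the weights renormalized to sum to one. The sketch below illustrates this on plain Python dicts of scalars; it is an illustration of the idea, not mergekit's actual implementation (which operates on full tensors):

```python
def linear_merge(state_dicts, weights, normalize=True):
    """Weighted average of parameter dicts.

    Illustrative sketch of the linear merge method: each parameter in
    the output is sum_i(w_i * param_i), with the weights optionally
    normalized so they sum to 1 (mirroring mergekit's `normalize` flag).
    """
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(w * sd[name] for w, sd in zip(weights, state_dicts))
    return merged


# Toy example with scalar "parameters", using the 0.1 / 0.9 split
# described above:
model_a = {"layer.weight": 1.0}
model_b = {"layer.weight": 2.0}
merged = linear_merge([model_a, model_b], weights=[0.1, 0.9])
# merged["layer.weight"] == 0.1 * 1.0 + 0.9 * 2.0 == 1.9
```

With real checkpoints, the same arithmetic is applied element-wise to every tensor in the two models' state dicts, which is why the source models must share an architecture.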