Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.1_linear

Text generation · Model size: 1.5B · Quant: BF16 · Context length: 32k · Published: Dec 24, 2025 · Architecture: Transformer

Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.1_linear is a 1.5 billion parameter language model created by Zachary1150 using the linear merge method. It combines two pre-trained language models via a weighted average, with the weighting strongly favoring one constituent over the other. The model is intended for general language tasks, leveraging the combined strengths of its constituent models.


Model Overview

This model, merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.1_linear, is a 1.5 billion parameter language model developed by Zachary1150. It was constructed using the Linear merge method via mergekit, combining two distinct pre-trained models.

Merge Details

The model integrates two base models, with a specific weighting applied during the merge process:

  • The first model, /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/acc_MRL4096_ROLLOUT4_LR2e-6/global_step_54/actor/huggingface, contributed with a weight of 0.1.
  • The second model, /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR2e-6/global_step_30/actor/huggingface, contributed with a higher weight of 0.9.

This configuration emphasizes the characteristics and capabilities of the second model while incorporating aspects of the first. The merge was performed with normalize: true and dtype: bfloat16; with normalization enabled, the weights are rescaled to sum to 1 before averaging (here, 0.1 + 0.9 already sums to 1, so normalization is a no-op).
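Based on the details above, the mergekit configuration likely resembled the following. This is a reconstruction from the stated weights and settings, not the author's published config file:

```yaml
models:
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/acc_MRL4096_ROLLOUT4_LR2e-6/global_step_54/actor/huggingface
    parameters:
      weight: 0.1
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR2e-6/global_step_30/actor/huggingface
    parameters:
      weight: 0.9
merge_method: linear
parameters:
  normalize: true
dtype: bfloat16
```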

Potential Use Cases

Given its merged nature, this model is likely suitable for:

  • General text generation and understanding tasks.
  • Applications where a blend of capabilities from the constituent models is beneficial.
  • Exploration of merged model performance in specific downstream applications.
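For intuition, the linear merge used to build this model is simply a weighted element-wise average of the two checkpoints' parameters. A minimal sketch, using plain Python lists to stand in for parameter tensors (mergekit itself operates on full model state dicts):

```python
def linear_merge(tensors, weights, normalize=True):
    """Weighted element-wise average of parameter tensors (the 'linear' merge method).

    With normalize=True, weights are rescaled to sum to 1 before averaging,
    mirroring mergekit's `normalize: true` option.
    """
    if normalize:
        total = sum(weights)
        weights = [w / total for w in weights]
    merged = [0.0] * len(tensors[0])
    for tensor, w in zip(tensors, weights):
        merged = [m + w * x for m, x in zip(merged, tensor)]
    return merged


# Toy parameter slices from the two constituent models:
model_a = [1.0, 2.0]  # hypothetical values, merged with weight 0.1
model_b = [3.0, 4.0]  # hypothetical values, merged with weight 0.9

print(linear_merge([model_a, model_b], [0.1, 0.9]))  # approximately [2.8, 3.8]
```

Because the weights already sum to 1, normalization leaves them unchanged and each merged parameter is 0.1·a + 0.9·b, which is why the result sits much closer to the second model.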