Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.7_linear
Text Generation · Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Dec 20, 2025 · Architecture: Transformer

Zachary1150/merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.7_linear is a 1.5 billion parameter language model created by Zachary1150 using a linear merge method. It combines two pre-trained language models, both actor checkpoints from the baselines_openrs project, and is intended for general language tasks that benefit from the combined strengths of its constituent models.


Model Overview

This model, merge_cosfmt_MRL4096_ROLLOUT4_LR5e-7_w0.7_linear, is a 1.5 billion parameter language model developed by Zachary1150. It was created using the mergekit tool, specifically employing the Linear merge method.

Merge Details

The model is a composite of two distinct pre-trained language models, both originating from actor checkpoints within the baselines_openrs project. The merge configuration assigned a weight of 0.7 to the cos_MRL4096_ROLLOUT4_LR5e-7 model and 0.3 to the accfmt_MRL4096_ROLLOUT4_LR5e-7 model, with normalization applied and bfloat16 as the data type.
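Given those details, the mergekit configuration likely resembled the sketch below. Only the merge method, weights (0.7/0.3), normalization, and bfloat16 dtype are stated in the card; the model paths are abbreviated names and the exact checkpoint locations are an assumption.

```yaml
# Hypothetical reconstruction of the merge config; checkpoint paths are illustrative.
models:
  - model: cos_MRL4096_ROLLOUT4_LR5e-7      # weight 0.7 (stated in the card)
    parameters:
      weight: 0.7
  - model: accfmt_MRL4096_ROLLOUT4_LR5e-7   # weight 0.3 (stated in the card)
    parameters:
      weight: 0.3
merge_method: linear
parameters:
  normalize: true
dtype: bfloat16
```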

Key Characteristics

  • Architecture: Merged model based on existing pre-trained language models.
  • Parameter Count: 1.5 billion parameters.
  • Merge Method: Linear merging, a weighted average of the constituent models' parameters.
  • Context Length: Supports a context length of 131072 tokens.
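The linear merge described above can be sketched in plain Python. This is a toy illustration of the weighted-average idea (scalar parameters instead of tensors), not mergekit's actual implementation; the function name and signature are assumptions.

```python
# Toy sketch of linear model merging: a normalized weighted average of
# parameters from two models, represented here as dicts of scalars.
def linear_merge(params_a, params_b, w_a=0.7, w_b=0.3, normalize=True):
    if normalize:
        # Rescale weights so they sum to 1, as mergekit's normalize option does.
        total = w_a + w_b
        w_a, w_b = w_a / total, w_b / total
    return {name: w_a * params_a[name] + w_b * params_b[name]
            for name in params_a}

# Example: merging a single shared parameter with weights 0.7 and 0.3.
merged = linear_merge({"layer.w": 1.0}, {"layer.w": 3.0})
# 0.7 * 1.0 + 0.3 * 3.0 = 1.6
```

In a real merge, the same weighted average is applied element-wise to every tensor in the two models' state dicts, and the result is cast to bfloat16.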

Potential Use Cases

This model is suitable for general language generation and understanding tasks where the combined capabilities of its constituent models are beneficial. Its large context window may be advantageous for applications requiring extensive contextual understanding.