Zachary1150/merge_linear_len0.5fmt0.5_MRL4096_ROLLOUT4_LR1e-6
Text generation · 1.5B parameters · BF16 · Transformer architecture

Zachary1150/merge_linear_len0.5fmt0.5_MRL4096_ROLLOUT4_LR1e-6 is a 1.5 billion parameter language model created by Zachary1150 using the Linear merge method. It combines two pre-trained base models, one targeting response length ("len") and the other output format ("fmt"), and supports a context length of 131072 tokens. The merge is intended for tasks that need a balance of these two capabilities, leveraging the strengths of its constituent models.


Model Overview

This model, Zachary1150/merge_linear_len0.5fmt0.5_MRL4096_ROLLOUT4_LR1e-6, is a 1.5 billion parameter language model developed by Zachary1150. It was constructed using the Linear merge method via MergeKit, combining two distinct base models. The merge specifically weighted each base model equally (0.5 for each), indicating an intent to balance their respective capabilities.

Key Characteristics

  • Architecture: Merged from two pre-trained language models.
  • Merge Method: Utilizes the Linear merge method, which computes a direct weighted average of the constituent models' parameters.
  • Parameter Count: 1.5 billion parameters.
  • Context Length: Supports a substantial context window of 131072 tokens.
  • Base Models: Composed of two internal base models, one focused on "len" (length) and the other on "fmt" (format), suggesting specialized training or optimization in these areas.
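The Linear merge described above can be illustrated with a short sketch. This is not MergeKit's actual implementation; it is a minimal, self-contained example that stands in tensors with plain Python lists and applies the equal 0.5/0.5 weights used for this model.

```python
def linear_merge(model_a, model_b, weight_a=0.5, weight_b=0.5):
    """Element-wise weighted average of two models' parameters.

    model_a / model_b: dicts mapping parameter names to lists of floats
    (stand-ins for real weight tensors). Both models must share the same
    parameter names and shapes, as merged base models do.
    """
    assert model_a.keys() == model_b.keys(), "models must share parameters"
    merged = {}
    for name in model_a:
        a, b = model_a[name], model_b[name]
        # Linear merge: merged = w_a * A + w_b * B, applied per element.
        merged[name] = [weight_a * x + weight_b * y for x, y in zip(a, b)]
    return merged


# Toy example: a "len"-focused and an "fmt"-focused model, equal weights.
len_model = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0, 4.0]}
fmt_model = {"layer.weight": [3.0, 0.0], "layer.bias": [2.0, 0.0]}
print(linear_merge(len_model, fmt_model))
# {'layer.weight': [2.0, 1.0], 'layer.bias': [1.0, 2.0]}
```

With equal weights this is a simple midpoint of the two models; unequal weights (e.g. 0.25/0.75) would bias the merged model toward one parent's behavior.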

Potential Use Cases

Given its merged nature and the implied focus of its base models, this model could be suitable for applications requiring:

  • Text generation with specific length constraints.
  • Output formatting adherence.
  • Tasks benefiting from a blend of length and format control.