Zachary1150/merge_linear_len0.7fmt0.3_MRL4096_ROLLOUT4_LR1e-6 is a 1.5-billion-parameter language model created by Zachary1150 as a linear merge of two specialized base models. It supports an exceptionally long context length of 131072 tokens, making it suitable for tasks that require extensive contextual understanding. Its merge weights the two constituent checkpoints at 70% and 30% respectively, suggesting the blend was tuned to favor the characteristics of the first component while retaining some from the second.
Model Overview
Zachary1150/merge_linear_len0.7fmt0.3_MRL4096_ROLLOUT4_LR1e-6 is a 1.5-billion-parameter language model developed by Zachary1150. It was created with mergekit using the linear merge method, combining two distinct pre-trained language models. This approach integrates the capabilities of its base components according to fixed per-model weights.
Merge Details
The model is a linear merge of two base models: a 70% weight is assigned to the checkpoint at /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/len_MRL4096_ROLLOUT4_LR1e-6/global_step_50/actor/huggingface and a 30% weight to the checkpoint at /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/fmt_MRL4096_ROLLOUT4/global_step_50/actor/huggingface. The merge also normalized the weights and used the bfloat16 data type.
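A mergekit linear merge with these settings would typically be described by a YAML configuration along the following lines. This is a reconstruction based on the details above, not the original config file:

```yaml
models:
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/len_MRL4096_ROLLOUT4_LR1e-6/global_step_50/actor/huggingface
    parameters:
      weight: 0.7
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/fmt_MRL4096_ROLLOUT4/global_step_50/actor/huggingface
    parameters:
      weight: 0.3
merge_method: linear
parameters:
  normalize: true
dtype: bfloat16
```

With `normalize: true`, mergekit rescales the weights to sum to 1 before combining parameters, so 0.7/0.3 is used as-is here.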
Key Characteristics
- Parameter Count: 1.5 billion parameters.
- Context Length: Features an extended context window of 131072 tokens.
- Merge Method: Employs the Linear merge method for combining base models.
- Weighted Integration: Combines two base models with a 70/30 weight distribution, suggesting a focus on specific characteristics from each component.
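The weighted integration above amounts to a per-parameter weighted average. A minimal pure-Python sketch of linear merging with normalization (parameter names and the helper itself are illustrative; the actual mergekit implementation also handles sharded checkpoints, dtype casting, and tokenizer alignment):

```python
def linear_merge(param_sets, weights, normalize=True):
    """Linearly combine matching parameter vectors from several models.

    param_sets: list of dicts mapping parameter name -> list of floats.
    weights:    one scalar weight per model (e.g. [0.7, 0.3]).
    """
    if normalize:
        # Rescale weights so they sum to 1, mirroring mergekit's
        # `normalize: true` option.
        total = sum(weights)
        weights = [w / total for w in weights]
    merged = {}
    for name in param_sets[0]:
        merged[name] = [
            sum(w * ps[name][i] for w, ps in zip(weights, param_sets))
            for i in range(len(param_sets[0][name]))
        ]
    return merged


# Hypothetical two-parameter "models" merged at 70/30:
model_a = {"w": [1.0, 0.0]}
model_b = {"w": [2.0, 10.0]}
merged = linear_merge([model_a, model_b], [0.7, 0.3])
```

Each merged parameter is simply `0.7 * a + 0.3 * b`, so capabilities that live in overlapping weight subspaces of the two checkpoints are blended rather than selected.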
Potential Use Cases
Given its large context window and specialized merging, this model could be suitable for applications requiring:
- Processing and understanding very long documents or conversations.
- Tasks benefiting from a blend of capabilities from its constituent base models; the checkpoint names suggest (though do not confirm) that 'len' and 'fmt' refer to length-oriented and format-oriented optimizations respectively.