Zachary1150/merge_linear_cos0.3fmt0.7_MRL4096_ROLLOUT4_LR1e-6 is a 1.5-billion-parameter language model created by Zachary1150, formed by a linear merge of two pre-trained models with weights of 0.3 and 0.7, a weighting designed to combine their respective strengths into a balanced performance profile. With a context length of 131072 tokens, it is suited to tasks requiring extensive contextual understanding and processing.
## Model Overview
This model, Zachary1150/merge_linear_cos0.3fmt0.7_MRL4096_ROLLOUT4_LR1e-6, is a 1.5 billion parameter language model developed by Zachary1150. It was created using the Linear merge method via mergekit, combining two distinct pre-trained base models.
## Merge Details
The model is a weighted linear merge of two base models, specifically:
- A model from `/local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/cos_MRL4096_ROLLOUT4_LR1e-6/global_step_50/actor/huggingface`, with a weight of 0.3.
- A model from `/local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/fmt_MRL4096_ROLLOUT4/global_step_50/actor/huggingface`, with a weight of 0.7.
This weighting suggests an intent to combine the characteristics of the two source models, with a stronger emphasis on the `fmt` component. The merge also applied weight normalization and was performed in bfloat16 precision.
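Based on these details, the merge could plausibly be reproduced with a mergekit configuration along the following lines. This is a reconstructed sketch, not the author's actual config file, which is not included in the card:

```yaml
merge_method: linear
models:
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/cos_MRL4096_ROLLOUT4_LR1e-6/global_step_50/actor/huggingface
    parameters:
      weight: 0.3
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/fmt_MRL4096_ROLLOUT4/global_step_50/actor/huggingface
    parameters:
      weight: 0.7
parameters:
  normalize: true
dtype: bfloat16
```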
## Key Characteristics
- Parameter Count: 1.5 billion.
- Context Length: Supports an extensive context window of 131072 tokens, enabling processing of very long inputs.
- Merge Method: A linear merge, which computes a weighted average of the source models' parameters to combine their strengths.
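The linear merge with normalization described above can be sketched in a few lines of plain Python. The parameter dicts and weights below are illustrative stand-ins (small lists instead of real tensors), not the actual model parameters:

```python
def linear_merge(state_dicts, weights, normalize=True):
    """Weighted elementwise average of parameter dicts.

    Each state dict maps a parameter name to a list of floats,
    standing in for a real tensor.
    """
    if normalize:
        # Rescale so the weights sum to 1, as mergekit's normalize option does.
        total = sum(weights)
        weights = [w / total for w in weights]
    merged = {}
    for name in state_dicts[0]:
        merged[name] = [
            sum(w * sd[name][i] for w, sd in zip(weights, state_dicts))
            for i in range(len(state_dicts[0][name]))
        ]
    return merged

# Two toy "models", merged with the card's weights of 0.3 and 0.7.
model_a = {"layer.weight": [1.0, 2.0]}
model_b = {"layer.weight": [3.0, 6.0]}
print(linear_merge([model_a, model_b], [0.3, 0.7]))
```

Each merged parameter lands between its two sources, pulled closer to `model_b` because of its 0.7 weight.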
## Potential Use Cases
Given its large context window and merged architecture, this model could be suitable for applications requiring:
- Processing and understanding lengthy documents or conversations.
- Tasks that benefit from a blend of capabilities from its constituent base models, potentially in areas like reasoning or text generation, depending on the nature of the `cos` and `fmt` source models.