Zachary1150/merge_linear_cos0.1fmt0.9_MRL4096_ROLLOUT4_LR1e-6
Text Generation · 1.5B parameters · BF16

The Zachary1150/merge_linear_cos0.1fmt0.9_MRL4096_ROLLOUT4_LR1e-6 model is a 1.5 billion parameter language model created by Zachary1150, formed by a linear merge of two pre-trained checkpoints. It supports a 131072-token context window, making it suitable for tasks requiring extensive contextual understanding. Its composition blends the 'cos' and 'fmt' base checkpoints with a 0.1/0.9 weighting, placing most of the emphasis on the 'fmt' model while retaining some of the 'cos' model's behavior. It is designed for applications that benefit from merged model capabilities and a very long context window.


Model Overview

This model, Zachary1150/merge_linear_cos0.1fmt0.9_MRL4096_ROLLOUT4_LR1e-6, is a 1.5 billion parameter language model developed by Zachary1150. It was constructed with the Linear merge method via mergekit, combining two distinct pre-trained base checkpoints. The merge assigns a weight of 0.1 to the 'cos' checkpoint and 0.9 to the 'fmt' checkpoint, a targeted blend that strongly favors the latter's characteristics.
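To make the merge procedure concrete, the sketch below shows what a linear (weighted-average) merge with these weights does to the parameters. The actual model was produced with mergekit; the checkpoint paths here are hypothetical placeholders, and this is a minimal illustration of the idea rather than the exact pipeline.

```python
import torch
from transformers import AutoModelForCausalLM

# Hypothetical local paths standing in for the 'cos' and 'fmt' actor checkpoints,
# with the 0.1/0.9 weighting from the merge configuration.
checkpoint_weights = {
    "path/to/cos_MRL4096_ROLLOUT4_LR1e-6": 0.1,
    "path/to/fmt_MRL4096_ROLLOUT4": 0.9,
}

# Load each checkpoint and keep its state dict.
state_dicts = {}
container = None
for path in checkpoint_weights:
    model = AutoModelForCausalLM.from_pretrained(path, torch_dtype=torch.bfloat16)
    state_dicts[path] = model.state_dict()
    container = model  # reuse the last model as a container for the merged weights

# Linear merge: every parameter is the weighted sum of the corresponding tensors.
merged_state = {
    name: sum(
        weight * state_dicts[path][name].float()
        for path, weight in checkpoint_weights.items()
    ).to(torch.bfloat16)
    for name in container.state_dict()
}

container.load_state_dict(merged_state)
container.save_pretrained("merge_linear_cos0.1fmt0.9_bf16")
```

This weighted averaging is only well defined because both checkpoints share the same architecture and parameter names, which is also the assumption mergekit's linear method relies on.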

Key Characteristics

  • Architecture: A merged model, combining /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/cos_MRL4096_ROLLOUT4_LR1e-6/global_step_50/actor/huggingface and /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/fmt_MRL4096_ROLLOUT4/global_step_50/actor/huggingface.
  • Merge Method: Utilizes the Linear merge method (weighted parameter averaging), as described in the Model Soups paper (arxiv.org/abs/2203.05482).
  • Parameter Weighting: The merge configuration applied a weight of 0.1 to the 'cos' base model and 0.9 to the 'fmt' base model, suggesting a strong emphasis on the latter's characteristics.
  • Data Type: The model was merged using bfloat16 precision.
  • Context Length: Features a substantial context window of 131072 tokens (a loading and generation sketch follows this list).
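
The snippet below is a minimal sketch of loading the model in bfloat16 and running text generation with the transformers library. It assumes the repository follows a standard causal-language-model layout; the prompt and generation settings are illustrative only.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Zachary1150/merge_linear_cos0.1fmt0.9_MRL4096_ROLLOUT4_LR1e-6"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the merge was performed in bfloat16
    device_map="auto",
)

prompt = "Explain, in two sentences, what a linear model merge does."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```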

Potential Use Cases

This model is particularly suited for applications that can benefit from:

  • Leveraging the combined strengths of its constituent base models.
  • Tasks requiring processing and understanding of very long input sequences due to its extended context window (a long-document sketch follows this list).
  • Scenarios where a specific blend of model capabilities, as defined by the 0.1/0.9 weighting, is advantageous.
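
As a rough illustration of the long-context use case, the sketch below tokenizes a long document, checks its length against the model's configured position limit, and asks for a summary. It reuses the tokenizer and model objects from the loading sketch above; the document path and prompt are hypothetical.

```python
# Reuses `tokenizer` and `model` from the loading sketch above.
long_text = open("long_report.txt", encoding="utf-8").read()  # hypothetical document
prompt = f"Summarize the following report in one paragraph:\n\n{long_text}\n\nSummary:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
print(
    f"Prompt length: {inputs['input_ids'].shape[1]} tokens "
    f"(configured limit: {model.config.max_position_embeddings})"
)

outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```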