Zachary1150/merge_linear_len0.3fmt0.7_MRL4096_ROLLOUT4_LR1e-6

Zachary1150/merge_linear_len0.3fmt0.7_MRL4096_ROLLOUT4_LR1e-6 is a 1.5 billion parameter language model created by Zachary1150 using a linear merge, a weighted average of two pre-trained checkpoints in which one contributes 30% of the weight and the other 70%. It supports a context length of 131072 tokens, making it suitable for tasks that require extensive contextual understanding, and is intended for applications that benefit from the combined strengths of its constituent models.


Model Overview

Zachary1150/merge_linear_len0.3fmt0.7_MRL4096_ROLLOUT4_LR1e-6 is a 1.5 billion parameter language model developed by Zachary1150. It was constructed with MergeKit's linear merge method, which combines two pre-trained source models by taking a parameter-wise weighted average, allowing their characteristics to be blended in controlled proportions.

Merge Details

This model is a blend of two base models, specifically:

  • /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/len_MRL4096_ROLLOUT4_LR1e-6/global_step_50/actor/huggingface (contributing 30% weight)
  • /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/fmt_MRL4096_ROLLOUT4/global_step_50/actor/huggingface (contributing 70% weight)

The merge was performed in the bfloat16 data type with weight normalization enabled, as specified in the merge configuration. The resulting model retains a context length of 131072 tokens.
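
To make the operation concrete, below is a minimal sketch of a linear merge with normalization, assuming both checkpoints share the same architecture and parameter names. The paths "model_a" and "model_b" are placeholders for the two checkpoints listed above, and the sketch illustrates the technique rather than reproducing MergeKit's actual implementation:

```python
import torch
from transformers import AutoModelForCausalLM

# Relative weights for the two source checkpoints (len: 0.3, fmt: 0.7).
sources = {"model_a": 0.3, "model_b": 0.7}

# "normalize" in the merge configuration rescales the weights to sum to 1
# (a no-op here, since 0.3 + 0.7 = 1 already).
total = sum(sources.values())
weights = {name: w / total for name, w in sources.items()}

# Load each source checkpoint and collect its parameter tensors.
state_dicts = {
    name: AutoModelForCausalLM.from_pretrained(
        name, torch_dtype=torch.bfloat16
    ).state_dict()
    for name in weights
}

# Parameter-wise weighted average, accumulated in float32 for numerical
# stability, then cast back to bfloat16.
reference = next(iter(state_dicts.values()))
merged_state = {
    key: sum(
        weights[name] * sd[key].to(torch.float32)
        for name, sd in state_dicts.items()
    ).to(torch.bfloat16)
    for key in reference
}

# Write the merged weights back into a model of the same architecture.
merged = AutoModelForCausalLM.from_pretrained("model_a", torch_dtype=torch.bfloat16)
merged.load_state_dict(merged_state)
merged.save_pretrained("merged-model")
```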

Potential Use Cases

Given its merged nature and substantial context window, this model is likely suitable for applications that require:

  • Extended context processing: handling long documents, conversations, or codebases (see the loading sketch after this list).
  • Specific task performance: leveraging the combined strengths of its constituent models for particular language understanding or generation tasks; the 'len' and 'fmt' prefixes in the source checkpoint names hint at length- and format-oriented training objectives, though these are not documented.
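
For a quick start, the following is a minimal sketch of loading the model with the Hugging Face transformers library; the prompt and generation settings are illustrative placeholders, not values recommended by the model author:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Zachary1150/merge_linear_len0.3fmt0.7_MRL4096_ROLLOUT4_LR1e-6"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "Summarize the key points of the following document:\n..."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Illustrative settings; tune max_new_tokens and sampling for your task.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(
    tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
)
```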