Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_linear
The Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_linear model is a 1.5 billion parameter language model created by Zachary1150, produced by merging two pre-trained models with the Linear merge method. It supports a context length of 131072 tokens, making it suitable for tasks that process very long sequences. The merge is intended to combine the strengths of its two constituent models into a single checkpoint.
Overview
This model, merge_accfmt_MRL4096_ROLLOUT4_LR2e-6_w0.5_linear, is a 1.5 billion parameter language model developed by Zachary1150. It was constructed using the Linear merge method via mergekit, combining two distinct pre-trained language models. The merging process involved assigning equal weights (0.5) to each constituent model, aiming to integrate their respective capabilities.
Merge Details
The model integrates the following two base models:
- /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR2e-6/global_step_30/actor/huggingface
- /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/acc_MRL4096_ROLLOUT4_LR2e-6/global_step_54/actor/huggingface
This linear combination, with normalized weights and bfloat16 dtype, balances and consolidates the strengths of the two source checkpoints rather than favoring either one. With a context length of 131072 tokens, the result is well-suited for tasks demanding extensive contextual understanding.
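The merge described above (Linear method, equal 0.5 weights, normalized, bfloat16) corresponds to a mergekit YAML configuration along the following lines. The exact config file used by the author is not published, so this is a hedged sketch reconstructed from the stated settings:

```yaml
# Sketch of a mergekit linear-merge config matching the stated settings.
# The two model paths are the local checkpoints listed under Merge Details.
models:
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR2e-6/global_step_30/actor/huggingface
    parameters:
      weight: 0.5   # equal weighting, per the model name suffix "w0.5"
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/acc_MRL4096_ROLLOUT4_LR2e-6/global_step_54/actor/huggingface
    parameters:
      weight: 0.5
merge_method: linear
parameters:
  normalize: true   # weights are normalized before combining
dtype: bfloat16
```

Such a config would typically be run with `mergekit-yaml config.yml ./output-dir` to produce the merged checkpoint.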
Potential Use Cases
- Applications requiring a blend of capabilities from the merged base models.
- Tasks benefiting from a large context window for processing long documents or conversations.