Name: Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR2e-6_w0.1_linear API
Brand: Featherless.ai
Price: 10.00 USD
Availability: InStock
Author: Zachary1150

Model Overview

Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR2e-6_w0.1_linear is a 1.5 billion parameter language model developed by Zachary1150. It was built with mergekit's Linear merge method, which combines two source checkpoints by taking a weighted average of their parameters.

Merge Details

This model blends two source models with the following merge weights:

Model 1 (Weight 0.1): /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/len_MRL4096_ROLLOUT4_LR2e-6/global_step_40/actor/huggingface
Model 2 (Weight 0.9): /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR2e-6/global_step_30/actor/huggingface

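The merge configuration used the linear method with normalize: true and dtype: bfloat16. With normalize: true, the weights are rescaled to sum to 1; here they already do (0.1 + 0.9), so each merged parameter is 10% Model 1 and 90% Model 2. The weighting therefore favors the second model, whose checkpoint name (accfmt) suggests accuracy and format training, while keeping a smaller contribution from the first, whose checkpoint name (len) suggests length-related training.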
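The weighted averaging described above can be illustrated with a minimal sketch. This is not mergekit's implementation: the function name linear_merge and the three-element parameter lists are hypothetical, chosen only to show how the 0.1 and 0.9 weights combine values when normalization is enabled.

```python
def linear_merge(tensors, weights, normalize=True):
    """Weighted average of same-length flat parameter lists (illustrative only)."""
    if len(tensors) != len(weights):
        raise ValueError("need exactly one weight per tensor")
    if normalize:
        total = sum(weights)
        if total == 0:
            raise ValueError("weights sum to zero; cannot normalize")
        # Rescale so the weights sum to 1.
        weights = [w / total for w in weights]
    length = len(tensors[0])
    if any(len(t) != length for t in tensors):
        raise ValueError("all tensors must have the same length")
    # Element-wise weighted sum across the source tensors.
    return [
        sum(w * t[i] for w, t in zip(weights, tensors))
        for i in range(length)
    ]


# Hypothetical toy parameters standing in for the two source checkpoints.
len_model = [1.0, 2.0, -4.0]     # stands in for Model 1 (weight 0.1)
accfmt_model = [3.0, 2.0, 0.0]   # stands in for Model 2 (weight 0.9)

merged = linear_merge([len_model, accfmt_model], [0.1, 0.9])
print(merged)
```

Because the weights already sum to 1, normalization leaves them unchanged here; passing [1, 9] instead would give the same merged values.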

Key Characteristics

Merged Architecture: Combines two specialized base models to achieve a hybrid performance profile.
Linear Merge Method: Utilizes a straightforward, weighted averaging approach for model merging.
Parameter Count: A compact 1.5 billion parameters, making it suitable for applications where efficiency is a concern.
Extended Context Length: Features a substantial context window of 131072 tokens, enabling it to process and generate very long texts.
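As a rough illustration of what the parameter count implies, the sketch below estimates the memory needed to hold the weights alone. It assumes the rounded figure of 1.5 billion parameters and that the weights are stored in bfloat16 (the merge dtype); the exact parameter count may differ, and activations and the key-value cache for long contexts are not included. The helper name weight_memory_gib is hypothetical.

```python
# Bytes needed to store one parameter in each dtype.
BYTES_PER_PARAM = {"bfloat16": 2, "float16": 2, "float32": 4}


def weight_memory_gib(num_params, dtype="bfloat16"):
    """Approximate size of the raw weights in GiB (weights only)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# Assumption: 1.5e9 is the rounded parameter count from the listing.
approx_gib = weight_memory_gib(1_500_000_000, "bfloat16")
print(f"{approx_gib:.2f} GiB")
```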
