Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR5e-7_w0.1_linear

Hugging Face

Text generation · Concurrency cost: 1 · Model size: 1.5B · Quantization: BF16 · Context length: 32k · Published: Dec 20, 2025 · Architecture: Transformer

Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR5e-7_w0.1_linear is a 1.5 billion parameter language model created by Zachary1150 through a linear merge of two pre-trained models. It features a long context length of 131,072 tokens, making it suitable for tasks requiring extensive contextual understanding. Its primary differentiation is its merged architecture, which combines a length-focused base model with an accuracy/formatting-focused one to potentially improve behavior on length control and output formatting. It is designed for applications that benefit from processing and generating very long sequences of text.


Model Overview

Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR5e-7_w0.1_linear is a 1.5 billion parameter language model developed by Zachary1150. It was constructed using the linear merge method from mergekit, which aims to combine the strengths of its two constituent checkpoints: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/len_MRL4096_ROLLOUT4_LR5e-7/global_step_30/actor/huggingface and /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface.
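A merge like this is typically driven by a mergekit YAML configuration. The sketch below is a reconstruction from the model name and the stated 0.1/0.9 weights, not the author's published config; the dtype is assumed from the BF16 quantization listed above.

```yaml
# Hypothetical mergekit config reconstructed from the model card (not the original).
merge_method: linear
models:
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/len_MRL4096_ROLLOUT4_LR5e-7/global_step_30/actor/huggingface
    parameters:
      weight: 0.1
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
    parameters:
      weight: 0.9
dtype: bfloat16
```

With mergekit installed, a config like this would be run via `mergekit-yaml config.yml ./output-dir`.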

Key Characteristics

  • Merged Architecture: Utilizes a linear merge with specific weighting (0.1 for the 'len' model and 0.9 for the 'accfmt' model) to balance contributions from its base components.
  • Extended Context Length: Features a notable context window of 131,072 tokens, allowing for processing and generation of very long texts.
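Concretely, a linear merge is a per-parameter weighted average of the two checkpoints. The sketch below illustrates the arithmetic with tiny NumPy stand-ins for real state dicts; it shows the 0.1/0.9 weighting described above, not mergekit's actual implementation.

```python
import numpy as np

def linear_merge(state_dicts, weights):
    """Weighted average of matching parameter tensors across models."""
    assert len(state_dicts) == len(weights)
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(w * sd[name] for w, sd in zip(weights, state_dicts))
    return merged

# Hypothetical one-tensor "models" standing in for the real checkpoints.
len_model = {"layer.weight": np.array([1.0, 2.0])}
accfmt_model = {"layer.weight": np.array([3.0, 4.0])}

# 0.1 for the 'len' model, 0.9 for the 'accfmt' model, as in this merge.
merged = linear_merge([len_model, accfmt_model], [0.1, 0.9])
print(merged["layer.weight"])  # [2.8 3.8]
```

Because the weights sum to 1.0, the merged parameters stay on the line segment between the two checkpoints, here sitting much closer to the 'accfmt' model.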

Potential Use Cases

  • Long-form Content Analysis: Ideal for tasks requiring deep understanding and summarization of extensive documents, codebases, or conversations.
  • Context-heavy Generation: Suitable for generating coherent and contextually relevant text over many thousands of tokens.
  • Research and Experimentation: Provides a merged model for researchers exploring the effects of linear merging on models with specific formatting and length-related pre-training.