Model Overview
Zachary1150/merge_accfmt_MRL4096_ROLLOUT4_LR5e-7_w0.3_linear is a 1.5 billion parameter language model developed by Zachary1150. It was constructed using the Linear merge method via mergekit, combining two distinct pre-trained language models. This merging technique allows for the integration of different model strengths into a single, cohesive unit.
Merge Details
The model was created by merging two base models, with specific weighting applied:
- The first base model received a weight of 0.3.
- The second base model, /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface, received a weight of 0.7.
The merge was performed in the bfloat16 data type with weight normalization enabled, as specified in the configuration. This approach aims to combine the capabilities of the constituent models.
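The details above correspond to a mergekit configuration along these lines. This is a reconstruction for illustration, not the card's actual file; the first base model is not named in the card, so a placeholder identifier is used:

```yaml
merge_method: linear
dtype: bfloat16
parameters:
  normalize: true
models:
  - model: base-model-1   # placeholder: the first base model is not named in the card
    parameters:
      weight: 0.3
  - model: /local/scratch/zli2255/workspace/MergeExpert/checkpoints/baselines_openrs/accfmt_MRL4096_ROLLOUT4_LR5e-7/global_step_54/actor/huggingface
    parameters:
      weight: 0.7
```

A file like this would be passed to `mergekit-yaml` to produce the merged checkpoint.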
Key Characteristics
- Architecture: Merged model using the Linear method.
- Parameter Count: 1.5 billion parameters.
- Context Length: Supports a substantial context window of 131072 tokens, enabling processing of long sequences.
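Conceptually, the Linear method computes an element-wise weighted average of corresponding parameter tensors, and with normalization enabled the weights are rescaled to sum to 1. The following is an illustrative sketch of that arithmetic in plain Python (not mergekit's actual implementation; `linear_merge` is a hypothetical helper):

```python
def linear_merge(params_a, params_b, w_a=0.3, w_b=0.7, normalize=True):
    """Element-wise weighted average of two flat parameter lists,
    mirroring the idea behind mergekit's linear merge method."""
    if normalize:
        # Rescale weights so they sum to 1, as with `normalize: true`.
        total = w_a + w_b
        w_a, w_b = w_a / total, w_b / total
    return [a * w_a + b * w_b for a, b in zip(params_a, params_b)]

# Toy example on two tiny "parameter tensors":
merged = linear_merge([1.0, 2.0], [3.0, 4.0])  # 0.3*A + 0.7*B element-wise
```

Here the weights 0.3 and 0.7 already sum to 1, so normalization is a no-op; it matters when the configured weights do not sum to 1.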
Potential Use Cases
As a merged model, it may inherit strengths from both constituent checkpoints, though any improvement over the individual base models should be verified empirically on the target task. Its 131072-token context window makes it particularly useful for applications involving long documents or extended conversations.