Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR2e-6_w0.9_linear

Hugging Face · Text Generation
Model Size: 1.5B · Quant: BF16 · Ctx Length: 32k · Published: Dec 24, 2025 · Architecture: Transformer

Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR2e-6_w0.9_linear is a 1.5-billion-parameter language model created by Zachary1150 using the Linear merge method. It combines two pre-trained base models with the aim of integrating their respective strengths. With a context length of 131,072 tokens, it is suited to tasks requiring extensive contextual understanding. Its main point of differentiation is that it was produced by model merging rather than by training from scratch.


Model Overview

Zachary1150/merge_lenfmt_MRL4096_ROLLOUT4_LR2e-6_w0.9_linear is a 1.5 billion parameter language model developed by Zachary1150. It was constructed using the Linear merge method via the mergekit tool, combining two distinct pre-trained base models. This approach aims to synthesize the capabilities of its constituent models into a single, more versatile model.

Merge Details

The model integrates two base models, with specific weighting applied during the merge process:

  • One base model received a weight of 0.9.
  • The second base model received a weight of 0.1.

This configuration emphasizes the characteristics of the first base model while incorporating aspects of the second. The merge was performed with normalize: true, which rescales the weights to sum to 1, and dtype: bfloat16, which stores the merged parameters in the half-precision format commonly used for model inference.
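The arithmetic behind a Linear merge is a per-tensor weighted average. The following is a minimal sketch of that rule using toy numbers, not the real model weights; the function name and values are illustrative, and bfloat16 casting is omitted for simplicity:

```python
# Minimal sketch of the linear (weighted-average) merge rule that a
# "linear" merge applies independently to each parameter tensor.
# Toy values only -- not the actual base models of this merge.

def linear_merge(params, weights, normalize=True):
    """Element-wise weighted average of matching parameter lists."""
    if normalize:  # corresponds to `normalize: true` in the merge config
        total = sum(weights)
        weights = [w / total for w in weights]
    return [sum(w * p[i] for w, p in zip(weights, params))
            for i in range(len(params[0]))]

# One toy "parameter vector" from each of the two base models,
# combined with the 0.9 / 0.1 weighting described above.
base_a = [1.0, 2.0, 3.0]
base_b = [5.0, 6.0, 7.0]
merged = linear_merge([base_a, base_b], [0.9, 0.1])
print([round(x, 6) for x in merged])
```

Because the 0.9 / 0.1 weights already sum to 1, normalization is a no-op here; it matters when a config's weights do not sum to 1, in which case each merged tensor would otherwise be scaled up or down.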

Key Characteristics

  • Architecture: Merged language model.
  • Parameter Count: 1.5 billion parameters.
  • Context Length: Supports a long context window of 131,072 tokens, enabling processing of extensive inputs.

Potential Use Cases

This model is suitable for applications where combining the strengths of different specialized base models is beneficial, particularly in scenarios demanding a large context window. Its merged nature suggests it might excel in tasks that require a blend of capabilities from its constituent models, such as complex reasoning, long-form content generation, or detailed analysis over extended texts.